Collecting, Analyzing and Presenting data about the participation in #ilovefs day


Documentation: word_cloud.py



Imports and Dependencies

The first import is the regular expressions library (re), which is used here to split the string of project names we want to visualize into single words.

Next, we also need the random library to scramble the words in order to work around an issue in the wordcloud_cli.py script.

This script requires Python 2 and the libraries imported above. Note that the script described here does not create the wordcloud itself; it only prepares the text that we afterwards forward to wordcloud_cli.py.


Scrambling

In order to scramble the words, we define the scrambled() function. It takes a list of strings, copies it, shuffles the copy with random.shuffle() and returns the result, leaving the original list untouched:

  def scrambled(orig):
    dest = orig[:]
    random.shuffle(dest)
    return dest

In conjunction with the defined function get_words_from_string(), which splits a string into its individual words, the entire script boils down to:

  • split string of projects into individual words (projects)
  • scramble the order of the words (projects)
  • join the words back into a single string
  • print() / output the resulting string

This is necessary because the wordcloud_cli.py script may otherwise treat several adjacent words as a single project, for example linux linux instead of just linux. Scrambling the words makes this effect much less likely.
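
The four steps listed above can be sketched as follows. This is a minimal illustration, not the actual contents of word_cloud.py: the body of get_words_from_string() and its splitting pattern, the helper name prepare_text() and the sample input are all assumptions made for this example. Only scrambled() is taken from the excerpt shown earlier.

```python
import random
import re


def get_words_from_string(text):
    # Assumed splitting rule (the real script may differ): every run of
    # letters, digits, '_', '+' or '-' counts as one word.
    return re.findall(r"[\w+-]+", text)


def scrambled(orig):
    # Shuffle a copy so the caller's list stays unchanged.
    dest = orig[:]
    random.shuffle(dest)
    return dest


def prepare_text(projects):
    # Hypothetical helper: split, scramble, then join back into one string.
    words = get_words_from_string(projects)
    return " ".join(scrambled(words))


if __name__ == "__main__":
    # Made-up sample input; the real script reads the collected project names.
    print(prepare_text("linux linux gimp firefox linux gimp"))
```

The output contains exactly the same words as the input, only in random order, so the word frequencies that determine the sizes in the wordcloud are preserved.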


Wordcloud

Wordcloud has rather sparse documentation, and since we did not write any of its code but simply use it, we omit discussing the project itself.

Important for our project is how we invoke the creation of the wordcloud. For this purpose, there’s a Makefile in the root-directory of this project:

  img:
	python word_cloud.py | wordcloud_cli.py --relative_scaling 0.6 \
	--imagefile graphics/word_cloud.png --width=2000 --height=2000 \
	--no_collocations --background="#ffffff"

Typing make in that directory invokes the word_cloud.py script, which scrambles and outputs the names of the projects mentioned on ILoveFS Day, and forwards the resulting string to wordcloud_cli.py. As options, we choose a relative size-scaling of 0.6 (0 is no size-scaling, 1 is maximum), a width and height of 2000 pixels and a white (#FFFFFF) background. The --no_collocations argument gives us better spacing, but you may want to experiment with that.
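
If make is not available, the same call can be driven from Python. The following is only a sketch under assumptions: the function names build_wordcloud_command() and render() are hypothetical and not part of this project, and it presumes that wordcloud_cli.py is on the PATH and reads its text from standard input, as the pipe in the Makefile suggests. The options are copied from the Makefile target.

```python
import subprocess


def build_wordcloud_command(imagefile="graphics/word_cloud.png"):
    # Same options as the Makefile target; only the output path is variable.
    return [
        "wordcloud_cli.py",
        "--relative_scaling", "0.6",
        "--imagefile", imagefile,
        "--width=2000",
        "--height=2000",
        "--no_collocations",
        "--background=#ffffff",
    ]


def render(text, imagefile="graphics/word_cloud.png"):
    # Feed the prepared text to wordcloud_cli.py on stdin, like the shell pipe.
    proc = subprocess.Popen(build_wordcloud_command(imagefile),
                            stdin=subprocess.PIPE)
    proc.communicate(text.encode("utf-8"))
    return proc.returncode
```

Passing the arguments as a list avoids shell quoting, which is why the background colour needs no extra quotes here.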