# Documentation: [word_cloud.py](../word_cloud.py)

## Table of Contents
* [Imports and Dependencies](#imports-and-dependencies)
* [Word Scrambling](#scrambling)
* [Creating the Wordcloud](#wordcloud)

* * *

## Imports and Dependencies

The first import is the
[regular expressions library](https://docs.python.org/2/library/re.html), which
is used here to split the string of project names we want to visualize into
individual words.

Next, we need the
[random library](https://docs.python.org/2/library/random.html) to scramble the
words in order to work around an issue in the `wordcloud_cli.py` script.

This script depends on Python 2 and the two standard-library modules imported
above. Note that the script described here does not create the wordcloud
itself; it only prepares the text that we then pass on to
[wordcloud_cli.py](https://github.com/amueller/word_cloud).

* * *

## Scrambling

To scramble the words, we define the `scrambled()` function. It takes a list of
strings, shuffles a copy of it with `random.shuffle()` and returns the result:
```
import random

def scrambled(orig):
    # Shuffle a copy so that the caller's list keeps its order.
    dest = orig[:]
    random.shuffle(dest)
    return dest
```
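
As a quick check of the copy semantics, here is a minimal sketch showing that `scrambled()` leaves its input untouched. The sample list is made up for illustration, and the `random.sample()` variant is only an assumed standard-library alternative, not something the script itself uses:

```python
import random


def scrambled(orig):
    # Same logic as above: shuffle a copy, never the original.
    dest = orig[:]
    random.shuffle(dest)
    return dest


# Hypothetical sample data, not taken from the real project list.
projects = ["linux", "firefox", "vim", "git"]
shuffled = scrambled(projects)

# Alternative (an assumption, not used by the script): sampling all
# elements without replacement also yields a shuffled copy.
sampled = random.sample(projects, len(projects))

print(projects)  # the original list keeps its order
print(sorted(shuffled) == sorted(projects))  # same words, new order
```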

Together with the function `get_words_from_string()`, which splits
a string into its individual words, the entire script boils down to:

* split the string of projects into individual words (projects)
* scramble the order of the words (projects)
* join the words back into a single string
* `print()` the resulting string

This is necessary because the `wordcloud_cli.py` script may otherwise treat
several adjacent words as a single project, for example `linux linux` instead of
just `linux`. Scrambling the words makes this effect much less likely.
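
The steps above can be sketched end to end as follows. This is a minimal sketch rather than the script's actual code: the body of `get_words_from_string()` (splitting on whitespace with `re.findall`), the helper `adjacent_repeats()` and the sample string are all assumptions made for illustration.

```python
import random
import re


def get_words_from_string(text):
    # Assumed implementation: every run of non-whitespace is one word.
    return re.findall(r"\S+", text)


def scrambled(orig):
    # Shuffle a copy so that the input list keeps its order.
    dest = orig[:]
    random.shuffle(dest)
    return dest


def adjacent_repeats(words):
    # Hypothetical helper: count neighbouring pairs of identical words.
    return sum(1 for a, b in zip(words, words[1:]) if a == b)


# Made-up input; the real script works on the collected project names.
projects = "linux linux linux firefox firefox vim"

words = get_words_from_string(projects)
result = " ".join(scrambled(words))

print(adjacent_repeats(words))  # 3 repeats in the unscrambled input
print(result)
```

Because the shuffle is random, the printed order differs from run to run, and with only a few distinct words some adjacent repeats can still survive a shuffle.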

* * *

## Wordcloud

[Wordcloud](http://amueller.github.io/word_cloud/index.html) is only sparsely
documented, and since we merely use it without having written any of its code,
we omit a discussion of the project itself.

What matters for our project is how we invoke the creation of the wordcloud. For
this purpose, there is a `Makefile` in the root directory of this project:
```
img:
	python word_cloud.py | wordcloud_cli.py --relative_scaling 0.6 \
		--imagefile graphics/word_cloud.png --width=2000 --height=2000 \
		--no_collocations --background="#ffffff"
```

Running `make` there invokes the `word_cloud.py` script, which scrambles and
prints the names of the projects mentioned on ILoveFS Day, and pipes the
resulting string to `wordcloud_cli.py`. As options, we choose a relative
scaling of 0.6 (0 sizes words by rank only, 1 makes font size proportional to
word frequency), a width and height of 2000 pixels and a white (`#ffffff`)
background. The `--no_collocations` argument, which stops the tool from adding
word pairs (bigrams) to the cloud, gives us better spacing, but you may want to
experiment with that.