Edit: Finishing documentation of plotte.R script

janwey 1 year ago
parent
commit
14a3bd884f
2 changed files with 77 additions and 0 deletions
  1. docs/plotter.md (+77, -0)
  2. docs/plotter.pdf (BIN)

+ 77
- 0
docs/plotter.md

@@ -5,6 +5,8 @@
5 5
 * [Packages used](#packages)
6 6
   * [The ggplot2 package](#the-ggplot2-package)
7 7
   * [The gridExtra package](#the-gridextra-package)
8
+* [Plot: Participation by Platform](#participation-by-platform)
9
+* [Plot: Participation by Time](#participation-by-time)
8 10
 
9 11
 * * *
10 12
 
@@ -97,3 +99,78 @@ well as `legend()` to provide some extra information, necessary to understand th
97 99
 graphic. Our first plot uses the previously constructed `platform` variable as
98 100
 input. Since this is a factor variable, R will automatically create a barplot
99 101
 from this data. For the color, we use red for the Fediverse and blue for Twitter.
102
+The y-axis limit is a construction which *should* work in the future as well, but
103
+may need some minor adaptation eventually. In its current form, it looks for the
104
+highest occurrence of tweets on a platform and rounds it up to the next higher
105
+100 (110 would become 200, 401 would become 500).
106
+```
107
+  ylim = c(0, ceiling(max(table(platform))/100) * 100)
108
+```
109
+
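The rounding rule can be checked in isolation. The following is only a sketch: `round_up_100()` is a hypothetical helper that does not exist in the script, and the `platform` factor is made-up data standing in for the real variable.

```r
# Hypothetical helper wrapping the expression used for ylim above.
round_up_100 <- function(x) ceiling(x / 100) * 100

# Made-up data standing in for the script's `platform` factor.
platform <- factor(c(rep("Fediverse", 110), rep("Twitter", 401)))

# The highest bar has 401 entries, so the axis ends at 500.
upper <- round_up_100(max(table(platform)))
ylim  <- c(0, upper)
```

Note that an exact multiple of 100 is left unchanged (300 stays 300), so in that case the tallest bar reaches the top of the axis. This may be one of the situations where the minor adaptation mentioned above becomes necessary.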
110
+The color in the second plot is generated with the `rainbow()` function, which
111
+outputs `n` colors spread across the full color spectrum, starting at *red* and
112
+stopping just before it would reach *red* again:
113
+```
114
+  col = rainbow(n = length(unique(instances)))
115
+```
116
+
117
+In order to have both plots in a single graphic next to each other, `par()` can
118
+be used prior to plotting. The argument in use here generates a grid with one
119
+row and two columns:
120
+```
121
+  par(mfrow=c(1,2))
122
+```
123
+
124
+By issuing `pdf()` prior to plotting and `dev.off()` afterwards, we can export
125
+the graphic directly to a PDF file (vectorized).
126
+
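Put together, the pieces described in this section could look roughly like the sketch below. It is not the script itself: the data is made up, the instance names and the output path are hypothetical, and the second plot is assumed to show the `instances` factor.

```r
# Made-up data standing in for the script's `platform` and `instances` variables.
platform  <- factor(c(rep("Fediverse", 110), rep("Twitter", 401)))
instances <- factor(rep(c("a.example", "b.example", "c.example"),
                        times = c(60, 30, 20)))

# Hypothetical output path in a temporary directory.
out_file <- file.path(tempdir(), "participation-sketch.pdf")

pdf(file = out_file, width = 14, height = 7)  # open the PDF device first
par(mfrow = c(1, 2))                          # grid: one row, two columns

# plot() on a factor draws a barplot; levels are sorted alphabetically,
# so "Fediverse" gets red and "Twitter" gets blue.
plot(platform,
     col  = c("red", "blue"),
     ylim = c(0, ceiling(max(table(platform)) / 100) * 100),
     main = "By platform")

plot(instances,
     col  = rainbow(n = length(unique(instances))),
     main = "By instance")

invisible(dev.off())                          # close the device, write the file
```

The order matters here: `par()` is called after `pdf()` because graphical parameters are set per device, so calling it before opening the PDF device would not affect the exported file.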
127
+* * *
128
+
129
+## Participation by Time
130
+
131
+For our second graphic we use functions from the `ggplot2` package instead of
132
+R's default `plot()` function, simply because `ggplot2` is much better with
133
+timeseries data.
134
+
135
+Before doing so, we first have to create the time series, for which we use the
136
+`date` and `time` variables in our datasets and the `strptime()` function. Joined
137
+with `paste0()` (`paste()` would insert a space between both
138
+strings), the strings have the form `YYYYMMDDhhmmss`, which we also specify as the
139
+`format` argument:
140
+```
141
+  twitter_time <- strptime(paste0(twitter$date, twitter$time),
142
+                           format = "%Y%m%d%H%M%S")
143
+  mastodon_time <- strptime(paste0(mastodon$date, mastodon$time),
144
+                            format = "%Y%m%d%H%M%S")
145
+```
146
+
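To see what the conversion does to a single entry, here is a small sketch. The two input strings are made-up values, and the `tz = "UTC"` argument is an assumption added only to make the result independent of the local time zone.

```r
# Made-up values in the assumed YYYYMMDD and hhmmss forms.
date_str <- "20180214"
time_str <- "093000"

stamp      <- paste0(date_str, time_str)  # no separator between the parts
with_space <- paste(date_str, time_str)   # paste() inserts a space by default

# strptime() returns a POSIXlt object.
parsed <- strptime(stamp, format = "%Y%m%d%H%M%S", tz = "UTC")

# POSIXct is the class documented for ggplot2's datetime scales.
parsed_ct <- as.POSIXct(parsed)
```

`strptime()` returns the list-like `POSIXlt` class. Converting with `as.POSIXct()` gives the compact form that `scale_x_datetime()` is documented for, which may be worth doing if plotting the raw `strptime()` result causes trouble.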
147
+`ggplot2` has a rather unconventional syntax for R functions. You can combine
148
+several ggplot-functions with a `+`, enabling you to create extremely complex
149
+plots rather easily. The `ggplot()` function itself only initializes the plot
150
+object and specifies the data we are going to use. We combine (`+`) this function
151
+with `geom_histogram()` which - as the name might imply - creates the histogram
152
+itself. `scale_x_datetime()` and `scale_y_continuous()` specify the x-axis as a
153
+timeline and the y-axis as a continuous count. Lastly, we can specify a
154
+title of the plot with `ggtitle()` and fill the bars of the plot with a gradient
155
+color from low to high with `scale_fill_gradient()`. In the case of Twitter, we
156
+save the entire plot into the `twitter_plot` variable for later use and do so
157
+similarly with Mastodon/the Fediverse:
158
+```
159
+  twitter_plot <- ggplot(data = twitter, aes(x=twitter_time)) +
160
+    geom_histogram(aes(fill=..count..), binwidth=60*180) + 
161
+    scale_x_datetime("Date") + 
162
+    scale_y_continuous("Frequency") +
163
+    ggtitle("Participation on Twitter") +
164
+    scale_fill_gradient("Count", low="#002864", high="#329cc3")
165
+```
166
+
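The same layering can be tried without the real datasets. The sketch below uses synthetic timestamps in a hypothetical `events` data frame and assumes ggplot2 3.3.0 or newer, where `after_stat(count)` is the current spelling of `..count..`. Since the x values are date-times measured in seconds, `binwidth = 60*180` means 180 minutes, so each bar covers three hours.

```r
library(ggplot2)

# Synthetic data: 500 events spread over three days (hypothetical).
set.seed(1)
start  <- as.POSIXct("2018-02-13 00:00:00", tz = "UTC")
events <- data.frame(time = start + runif(500, min = 0, max = 3 * 24 * 3600))

# Date-time axes are measured in seconds: 60 s * 180 = 3 hours per bar.
bin_seconds <- 60 * 180

sketch_plot <- ggplot(data = events, aes(x = time)) +
  geom_histogram(aes(fill = after_stat(count)), binwidth = bin_seconds) +
  scale_x_datetime("Date") +
  scale_y_continuous("Frequency") +
  ggtitle("Participation (synthetic data)") +
  scale_fill_gradient("Count", low = "#002864", high = "#329cc3")
```

Keeping the timestamps as a column of the data frame, rather than as a separate vector, is a design choice of this sketch. It keeps the data and the aesthetic mapping in one place.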
167
+As opposed to R's `plot()` function, we cannot use `par()` to create a unified
168
+graphic with `ggplot2`. Instead we use `grid.arrange()` from the `gridExtra`
169
+package. It takes both plots as arguments, as well as the number of columns it
170
+should arrange them in. With `pdf()` and `dev.off()` we can save the graphic
171
+into a PDF file (vectorized) directly:
172
+```
173
+  pdf(file="./plots/ilfs-participation-by-date.pdf", width=14, height=7)
174
+  grid.arrange(twitter_plot, mastodon_plot, ncol = 2)
175
+  dev.off()
176
+```

BIN
docs/plotter.pdf

