added script to format exported supporter data for CiviCRM

svn path=/trunk/; revision=26430
tags/stw2018
Max Mehl 6 years ago
parent
commit
51da5ac52e

+ 1
- 1
support/export-support-as-csv-with-obscured-url-lkaf9847h59j7f4s5ds.php

@@ -19,7 +19,7 @@ along with this program. If not, see <http://www.gnu.org/licenses/>.
// This file is temporary - now I just need to see the db contents
// to develop email confirmation function

//die("This file is for debugging only.");
die("This file is for debugging only.");

$db = new PDO("sqlite:../../../db/support.sqlite");


+ 49
- 0
support/extract_and_import/README.txt

@@ -0,0 +1,49 @@
## GENERAL
For each supporter, the following information is stored:

id: ID number; there are more numbers than supporters because non-confirmed supporters are deleted AFAIK
time: Date and time of subscription
firstname: First Name of Supporter
lastname: Last Name of Supporter
email: Email address of supporter
country_code: Two-letter country code of supporter (ISO 3166-1 alpha-2, as it seems)
secret: unknown, seems to be an MD5 hash
signed: unused, maybe an option to subscribe to the newsletter
confirmed: Date and time of email confirmation
updated: unused AFAICS
ref_url: Referral URL
ref_id: Referral ID (like "mk" for Matthias or "google" for a Google URL)
lang: Maybe language of browser?
reminder1: Date and time of 1st reminder mail?
reminder2: Date and time of 2nd reminder mail?
reminder3: Date and time of 3rd reminder mail?
zip: ZIP Code, unused
city: City, unused
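
The column order above can be checked against an export with a few lines of shell. This is only a sketch: it assumes the CSV columns appear in the order listed above (which matches the field numbers used in format-supporters.sh) and that no value contains a comma. The sample row and the function name show_columns are invented for illustration.

```shell
#!/bin/bash
# Hypothetical sample row; real exports contain personal data.
sample='"17","2013-10-31 12:00:00","ada","LOVELACE","Ada@Example.org","GB"'

# Print "number name = value" for the first six columns of one CSV line.
# Naive split on commas: assumes no commas inside quoted values.
show_columns() {
  printf '%s\n' "$1" | awk -F, -v names="id,time,firstname,lastname,email,country_code" '
    {
      gsub(/"/, "")                 # strip the double quotes, fields are re-split
      n = split(names, field, ",")  # field[1] = "id", field[2] = "time", ...
      for (i = 1; i <= n; i++)
        printf "%d %s = %s\n", i, field[i], $i
    }'
}

show_columns "$sample"
```

Running it on a single line of a real export (for example the output of head -n 1) shows quickly whether the assumed order holds.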


## DOWNLOADING
To download the supporters file, follow these steps:
1. Modify the file /fsfe-web/trunk/support/export-support-as-csv-with-obscured-url-lkaf9847h59j7f4s5ds.php:
Change: die("This file is for debugging only.");
to: //die("This file is for debugging only.");

2. Commit the change, wait for the rebuild and open the following URL:
https://fsfe.org/support/export-support-as-csv-with-obscured-url-lkaf9847h59j7f4s5ds.php
The download should start now. If not, double-check step 1 or wait a bit longer (~10 min)

3. After downloading, revert step 1 by removing the // in front of the die() line, and commit again.
Otherwise everybody would be able to download sensitive data.


## IMPORTING
If you want to import this data into CiviCRM, you need to format it first. For example, many names are all lowercase, all uppercase or empty. CiviCRM does not accept the country codes either; it expects full country names. Additionally, some of the data is of no use for the import. If you disagree, feel free to change the script.

For this, use format-supporters.sh together with countries.txt. At the top of the script, set the file to import (default: supporters.csv), the output filename (default: supporters_format.csv) and the file that maps country codes to country names (default: countries.txt).

Now execute the shell script in your local terminal:
./format-supporters.sh

After ~5-10 minutes, all entries should be written to the output file. Do not worry if the output file stays empty until the end of the script: it writes the data to a temporary file while it runs.


To understand exactly what the script does, please see the comments in the bash script.
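
As an illustration of the name clean-up mentioned above, the following sketch capitalises up to three parts of a name with a single awk call. It is not the code used in format-supporters.sh; the function name format_name and the sample names are made up, and non-ASCII letters may or may not be converted depending on the awk implementation and locale.

```shell
#!/bin/bash
# Sketch: lowercase every name part, uppercase its first letter,
# and keep at most the first three parts.
format_name() {
  printf '%s\n' "$1" | awk '
    {
      out = ""
      n = (NF < 3) ? NF : 3
      for (i = 1; i <= n; i++) {
        w = tolower($i)
        out = out (i > 1 ? " " : "") toupper(substr(w, 1, 1)) substr(w, 2)
      }
      print out
    }'
}

format_name "mAX"
format_name "anna MARIA"
```

An empty name yields an empty line, which mirrors the script's rule of skipping entries without a first name.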

+ 247
- 0
support/extract_and_import/countries.txt

@@ -0,0 +1,247 @@
# Kosovo is not available in CiviCRM, so XK = Serbia
# Netherlands Antilles is not available in CiviCRM, so AN = Netherlands
country_code:country
AD:Andorra
AE:United Arab Emirates
AF:Afghanistan
AG:Antigua and Barbuda
AI:Anguilla
AM:Armenia
AN:Netherlands
AO:Angola
AQ:Antarctica
AR:Argentina
AS:American Samoa
AT:Austria
AU:Australia
AW:Aruba
AX:Åland Islands
AZ:Azerbaijan
BA:Bosnia and Herzegovina
BB:Barbados
BD:Bangladesh
BE:Belgium
BF:Burkina Faso
BG:Bulgaria
BH:Bahrain
BI:Burundi
BJ:Benin
BM:Bermuda
BN:Brunei Darussalam
BO:Bolivia
BR:Brazil
BS:Bahamas
BT:Bhutan
BV:Bouvet Island
BW:Botswana
BY:Belarus
BZ:Belize
CA:Canada
CC:Cocos (Keeling) Islands
CD:Congo, The Democratic Republic of the
CF:Central African Republic
CG:Congo, Republic of the
CH:Switzerland
CI:Côte d'Ivoire
CK:Cook Islands
CL:Chile
CM:Cameroon
CN:China
CO:Colombia
CR:Costa Rica
CU:Cuba
CV:Cape Verde
CX:Christmas Island
CY:Cyprus
CZ:Czech Republic
DE:Germany
DJ:Djibouti
DK:Denmark
DM:Dominica
DO:Dominican Republic
DZ:Algeria
EC:Ecuador
EE:Estonia
EG:Egypt
EH:Western Sahara
ER:Eritrea
ES:Spain
ET:Ethiopia
FI:Finland
FJ:Fiji
FK:Falkland Islands (Malvinas)
FM:Micronesia, Federated States of
FO:Faroe Islands
FR:France
GA:Gabon
GB:United Kingdom
GD:Grenada
GE:Georgia
GF:French Guiana
GG:Guernsey
GH:Ghana
GI:Gibraltar
GL:Greenland
GM:Gambia
GN:Guinea
GP:Guadeloupe
GQ:Equatorial Guinea
GR:Greece
GS:South Georgia and the South Sandwich Islands
GT:Guatemala
GU:Guam
GW:Guinea-Bissau
GY:Guyana
HK:Hong Kong
HM:Heard Island and McDonald Islands
HN:Honduras
HR:Croatia
HT:Haiti
HU:Hungary
ID:Indonesia
IE:Ireland
IL:Israel
IM:Isle of Man
IN:India
IO:British Indian Ocean Territory
IQ:Iraq
IR:Iran, Islamic Republic of
IS:Iceland
IT:Italy
JE:Jersey
JM:Jamaica
JO:Jordan
JP:Japan
KE:Kenya
KG:Kyrgyzstan
KH:Cambodia
KI:Kiribati
KM:Comoros
KN:Saint Kitts and Nevis
KP:Korea, Democratic People's Republic of
KR:Korea, Republic of
KW:Kuwait
KY:Cayman Islands
KZ:Kazakhstan
LA:Lao People's Democratic Republic
LB:Lebanon
LC:Saint Lucia
LI:Liechtenstein
LK:Sri Lanka
LR:Liberia
LS:Lesotho
LT:Lithuania
LU:Luxembourg
LV:Latvia
LY:Libya
MA:Morocco
MC:Monaco
MD:Moldova
ME:Montenegro
MG:Madagascar
MH:Marshall Islands
MK:Macedonia, Republic of
ML:Mali
MM:Myanmar
MN:Mongolia
MO:Macao
MP:Northern Mariana Islands
MQ:Martinique
MR:Mauritania
MS:Montserrat
MT:Malta
MU:Mauritius
MV:Maldives
MW:Malawi
MX:Mexico
MY:Malaysia
MZ:Mozambique
NA:Namibia
NC:New Caledonia
NE:Niger
NF:Norfolk Island
NG:Nigeria
NI:Nicaragua
NL:Netherlands
NO:Norway
NP:Nepal
NR:Nauru
NU:Niue
NZ:New Zealand
OM:Oman
PA:Panama
PE:Peru
PF:French Polynesia
PG:Papua New Guinea
PH:Philippines
PK:Pakistan
PL:Poland
PM:Saint Pierre and Miquelon
PN:Pitcairn
PR:Puerto Rico
PS:Palestinian Territory, Occupied
PT:Portugal
PW:Palau
PY:Paraguay
QA:Qatar
RE:Reunion
RO:Romania
RS:Serbia
RU:Russian Federation
RW:Rwanda
SA:Saudi Arabia
SB:Solomon Islands
SC:Seychelles
SD:Sudan
SE:Sweden
SG:Singapore
SH:Saint Helena
SI:Slovenia
SJ:Svalbard and Jan Mayen
SK:Slovakia
SL:Sierra Leone
SM:San Marino
SN:Senegal
SO:Somalia
SR:Suriname
ST:Sao Tome and Principe
SV:El Salvador
SY:Syrian Arab Republic
SZ:Swaziland
TC:Turks and Caicos Islands
TD:Chad
TF:French Southern Territories
TG:Togo
TH:Thailand
TJ:Tajikistan
TK:Tokelau
TL:Timor-Leste
TM:Turkmenistan
TN:Tunisia
TO:Tonga
TR:Turkey
TT:Trinidad and Tobago
TV:Tuvalu
TW:Taiwan
TZ:Tanzania, United Republic of
UA:Ukraine
UG:Uganda
UM:United States Minor Outlying Islands
US:United States
UY:Uruguay
UZ:Uzbekistan
VA:Holy See (Vatican City State)
VC:Saint Vincent and the Grenadines
VE:Venezuela
VG:Virgin Islands, British
VI:Virgin Islands, U.S.
VN:Viet Nam
VU:Vanuatu
WF:Wallis and Futuna
WS:Samoa
XK:Serbia
YE:Yemen
YT:Mayotte
ZA:South Africa
ZM:Zambia
ZW:Zimbabwe

+ 130
- 0
support/extract_and_import/format-supporters.sh

@@ -0,0 +1,130 @@
#!/bin/bash


# Please read README.txt before executing this file
# This script formats the exported supporter file (.csv) so that it can be imported into CiviCRM directly
# It only takes the First Name, Last Name, Email Address and the Country because everything else seemed to be of no interest for further campaigning
# Of course you can change that easily, but please be aware that CiviCRM does not have fields like "ref_url" by default
# This script is far from perfect but does what it should do. For 3500 supporters, it needed ~5 minutes

# Written by Max Mehl <max.mehl@fsfe.org> for Free Software Foundation Europe
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as
# published by the Free Software Foundation, either version 3 of the
# License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
#
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.


# CHANGE these files if needed
INPUT=supporters.csv
OUTPUT=supporters_format.csv
COUNTRIES=countries.txt

## FROM HERE you (hopefully) do not need to change anything
# empty output file and temporary files
> "$OUTPUT"
> temp131031.txt
> temp131031_new.txt

# Makes all first letters of max. 3 names uppercase and the rest lowercase
function nameformat {
ORIGINALNAME=$1 # Name which should be formatted, given by the function call
unset NAME # forget name parts left over from the previous call
t=1 # start with the first name
AWK="1"
while [ "$AWK" != "" ]; do # only runs if there is still a name

NAME=$(echo "$ORIGINALNAME" | awk -v t="$t" '{print $t}') # takes out part number t of the name

NAME=$(echo "$NAME" | sed "s:[A-Z]:\L&:g") # make all letters lowercase (\L is a GNU sed extension)
NAME_1l=${NAME:0:1} # first letter
NAME_REST=${NAME:1} # rest of lowercase letters remain unchanged
NAME_1u=$(echo "$NAME_1l" | tr '[:lower:]' '[:upper:]') # make first letter uppercase
NAME=$NAME_1u$NAME_REST # put strings together
NAME[$t]=$NAME # Put the formatted name in a different NAME element each loop
(( t++ )) # Counts 1 up for next part of name
AWK=$(echo "$ORIGINALNAME" | awk -v t="$t" '{print $t}') # reads the next part of the name, if existing
done

CORRECTNAME="${NAME[1]}" # if there is only one name
if [ "${NAME[3]}" != "" ]; then # if there are 3 names or more; any further names are dropped
CORRECTNAME="${NAME[1]} ${NAME[2]} ${NAME[3]}"
elif [ "${NAME[2]}" != "" ]; then # if there are 2 names
CORRECTNAME="${NAME[1]} ${NAME[2]}"
fi
}


while IFS= read -r line
do
# erase all "", will be added later
line=$(echo "$line" | sed 's/"//g')
# NOTE: splitting on "," assumes that no field contains a comma
#ID=$(echo $line | awk -F, '{ print $1 }')
#DATE=$(echo $line | awk -F, '{ print $2 }')
FIRSTNAME=$(echo "$line" | awk -F, '{ print $3 }')
LASTNAME=$(echo "$line" | awk -F, '{ print $4 }')
EMAIL=$(echo "$line" | awk -F, '{ print $5 }')
CCODE=$(echo "$line" | awk -F, '{ print $6 }')
#SECRET=$(echo $line | awk -F, '{ print $7 }')
#SIGNED=$(echo $line | awk -F, '{ print $8 }')
#CONFIRMDATE=$(echo $line | awk -F, '{ print $9 }')
#UPDATEDATE=$(echo $line | awk -F, '{ print $10 }')
#REFURL=$(echo $line | awk -F, '{ print $11 }')
#REFID=$(echo $line | awk -F, '{ print $12 }')
# DATE: erase time, only keep date (disabled because DATE is not extracted above)
#DATE=$(echo "$DATE" | awk '{ print $1 }')

# FIRSTNAME: (only) first letters uppercase
nameformat "$FIRSTNAME"
FIRSTNAME="$CORRECTNAME"
# LASTNAME: (only) first letters uppercase
nameformat "$LASTNAME"
LASTNAME="$CORRECTNAME"
# EMAIL: all letters lowercase
EMAIL=$(echo "$EMAIL" | sed "s:[A-Z]:\L&:g") # make all letters lowercase

## Replace Country Code with full Country name
## FAR TOO SLOW!!!
#while read line
#do
#GREP=$(echo "$line" | grep "$CCODE")
#if [ $? = 0 ]; then
#COUNTRY=$(echo "$line" | awk -F: '{print $2}')
#fi
#done <"$COUNTRIES"
# Output of all interesting strings with "" surrounded, only if Firstname not empty
if [ "$FIRSTNAME" != "" ]; then
echo "\"$FIRSTNAME\",\"$LASTNAME\",$EMAIL,\"$CCODE\"" >> temp131031.txt
fi
done <"$INPUT"

# Replaces all Country Codes with the Full Country name used in CiviCRM
while IFS= read -r line
do
case "$line" in "#"*|"") continue ;; esac # skip comment and empty lines
CCODE=$(echo "$line" | awk -F: '{print $1}')
CNAME=$(echo "$line" | awk -F: '{print $2}')
# only quoted codes are replaced; country names must not contain "/" or "&"
sed "s/\"$CCODE\"/\"$CNAME\"/g" temp131031.txt > temp131031_new.txt
mv temp131031_new.txt temp131031.txt
done <"$COUNTRIES"

mv temp131031.txt "$OUTPUT"
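
The comments in the script note that the country lookup is slow, because every line spawns several processes and the temporary file is rewritten once per country. One possible alternative is a single awk pass that first loads countries.txt into an array. The sketch below assumes the same column order as above and no commas inside quoted values. The function name format_fast is hypothetical, and name capitalisation is left out to keep the example short.

```shell
#!/bin/bash
# Sketch: map country codes to names in one pass over the supporter file.
# Usage: format_fast countries.txt supporters.csv > supporters_format.csv
format_fast() {
  awk -F, '
    # First file: build the code -> name table, skipping comment lines.
    NR == FNR {
      if ($0 !~ /^#/ && index($0, ":") > 0) {
        code = substr($0, 1, index($0, ":") - 1)
        name[code] = substr($0, index($0, ":") + 1)
      }
      next
    }
    # Second file: strip quotes, then print names, email and country.
    {
      gsub(/"/, "")
      if ($3 == "") next                      # skip rows without a first name
      country = ($6 in name) ? name[$6] : $6  # unknown codes stay as they are
      printf "\"%s\",\"%s\",%s,\"%s\"\n", $3, $4, tolower($5), country
    }
  ' "$1" "$2"
}
```

Because the table is kept in memory, each supporter line is read only once, so the run time should grow with the number of supporters rather than with supporters times countries.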



