Visualization with Gource
Recently I made this visualization of the contributors to https://l10n.org.al for the last two years (2013-2015). It was done with Gource. It took a bit of hacking, so I would like to describe here how I did it.
Getting The Data To Be Visualized
For each translation or vote on https://l10n.org.al/ the time and the name of the author are saved as well. So we would like to get this information from the database, for all votes and translations, along with the project to which each translation belongs.
First have a look at this diagram, just to get an idea of the structure of the database:
Then look at the script below which I used to extract the data.
#!/bin/bash
query="
SELECT v.time, u.name, (v.time=t.time) AS new, project, origin
FROM votes v
LEFT JOIN users u ON (u.ulng = v.ulng AND u.umail = v.umail)
LEFT JOIN translations t ON (t.tguid = v.tguid)
LEFT JOIN strings s ON (s.sguid = t.sguid)
LEFT JOIN locations l ON (l.sguid = s.sguid)
LEFT JOIN templates tpl ON (tpl.potid = l.potid)
LEFT JOIN projects p ON (p.pguid = tpl.pguid)
"
mysql -u root -D btranslator_data -B -e "$query" > contrib.txt
This is simply a join of the tables, extracting the fields that are needed. When the time of a vote is the same as the time of the translation, then this is a new translation and the field new is true.
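For illustration, the exported rows of contrib.txt look roughly like this (the times, names and projects below are made up; the header row that mysql -B emits is omitted):

```shell
# Print two hypothetical tab-separated rows in the shape of contrib.txt:
# time, author name, new flag, project, origin (all values invented)
printf '%s\t%s\t%s\t%s\t%s\n' \
    '2014-03-05 10:21:43' 'Alice' '1' 'konsole'  'KDE' \
    '2014-03-05 10:25:10' 'Bob'   '0' 'nautilus' 'GNOME'
```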
Transforming Data To The Right Format
In the result file contrib.txt the fields are separated by TABs, so first I replaced the tabs with commas (,) for easier processing. I did it with find/replace in Emacs, but any other editor can do it.
Gource expects the input file (which is called a log file) in the format timestamp|username|action|filename. Gource was designed to work with version control systems (like git, subversion, etc.) in order to visualize project activity, so the action is expected to be A, M, D, etc. (respectively for adding, modifying or deleting a file) and filename is the file that was touched.
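For example, a couple of lines in that format might look like this (the timestamps, names and paths are invented, just to show the shape):

```shell
# Write two hypothetical gource log lines: timestamp|username|action|filename
cat <<'EOF' > sample.log
1394014903|Alice|A|/KDE/konsole
1394015110|Bob|M|/GNOME/nautilus
EOF
cat sample.log
```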
I had to transform the data to this format and I did it with a script like this:
#!/bin/bash
while IFS= read -r l
do
time=$(echo "$l" | cut -d, -f1)
name=$(echo "$l" | cut -d, -f2)
new=$(echo "$l" | cut -d, -f3)
project=$(echo "$l" | cut -d, -f4)
origin=$(echo "$l" | cut -d, -f5)
timestamp=$(date -d "$time" +%s) # convert the time to a Unix timestamp
# a new translation becomes an addition (A), a vote a modification (M)
if [ "$new" = "1" ]; then action='A'; else action='M'; fi
path="/$origin/$project"
echo "$timestamp|$name|$action|$path"
done < contrib.txt > contrib.log
I also made sure that there are no lines containing NULL and that the file is sorted in increasing order by the timestamp:
sed -i contrib.log -e '/NULL/d'
sort -t'|' -k1,1n contrib.log > contrib-1.log
rm contrib.log
mv contrib-1.log contrib.log
Feeding The Data To Gource And Generating The Output
First make sure that the tools that we need are installed:
sudo apt-get install gource ffmpeg
Then generate the video with a script like this:
#!/bin/bash
gource contrib.log -s 0.1 -i 0 --max-files 0 \
--date-format "%B %Y" --hide dirnames,filenames \
-640x360 -o - \
| ffmpeg -y -r 25 -f image2pipe -vcodec ppm -i - \
-vcodec libvpx -b 10000K contrib.webm
How did I know that these are the right options? I didn't (especially for ffmpeg); I simply googled and found some examples, then played with some of the options of gource in order to get it right.
Using NGINX as a Web Server for Drupal
Nginx (engine-x) is a web server that is regarded to be faster than Apache and with a better performance on heavy load. The difference is summed up succinctly in a quote by Chris Lea on the Why Use Nginx? page: "Apache is like Microsoft Word, it has a million options but you only need six. Nginx does those six things, and it does five of them 50 times faster than Apache."
Technically speaking, Apache is a process-and-thread-driven application, while Nginx is event-driven. In practice this means that Nginx needs much less memory than Apache to do the same work, and can also work faster. There are claims that Nginx, running on a server with 512MB RAM, can handle 10,000 (yes, ten thousand) concurrent requests without problems, while Apache under such a load would just commit harakiri (suicide). Besides, the configuration of Nginx, once you get used to it, is simpler and more intuitive than that of Apache.
It seemed like something that I should definitely give a try, since my web server already had performance problems and I could not afford to pay for increasing its capacity. Here I describe the steps for installing and configuring Nginx to suit the needs of my web application (which is based on Drupal 7, running on a 512MB RAM server at Rackspace).
Table of Contents
- Installing nginx and php5-fpm
- Configuring php5-fpm
- Configuring nginx
- Configuration for phpMyAdmin
- SSL (HTTPS) support
- Avoid any DOS attacks
- Full configuration of the site
1. Installing nginx and php5-fpm
On an Ubuntu server this is very easy:
sudo apt-get install nginx nginx-doc php5-fpm
update-rc.d apache2 disable
update-rc.d nginx enable
service apache2 stop
service nginx start
2. Configuring php5-fpm
The main config file (/etc/php5/fpm/php-fpm.conf) did not need to be changed at all. On the pool configuration file (/etc/php5/fpm/pool.d/www.conf) I made only some small modifications:
- Listen on a unix socket, instead of a TCP socket:
;listen = 127.0.0.1:9000
listen = /var/run/php-fpm.sock
- Other modified options:
pm.max_requests = 5000
php_flag[display_errors] = on
php_admin_value[memory_limit] = 128M
php_admin_value[max_execution_time] = 90
I also made these modifications in /etc/php5/fpm/php.ini:
cgi.fix_pathinfo=0
max_execution_time = 90
display_errors = On
post_max_size = 16M
upload_max_filesize = 16M
default_socket_timeout = 90
Finally restarted the service php5-fpm:
service php5-fpm restart
3. Configuring nginx
On Ubuntu, the configuration of Nginx is located at /etc/nginx/.
- Create a configuration file for the website, based on the Drupal example configuration file:
cd /etc/nginx/sites-available/
cp /usr/share/doc/nginx-doc/examples/drupal.gz .
gunzip drupal.gz
mv drupal btranslator_dev
cd /etc/nginx/sites-enabled/
ln -s ../sites-available/btranslator_dev .
- In /etc/nginx/sites-enabled/btranslator_dev modify server_name and root, and also add access_log and error_log:
server_name dev.btranslator.org l10n-dev.org.al;
root /var/www/dev.btranslator.org;
access_log /var/log/nginx/btranslator_dev.access.log;
error_log /var/log/nginx/btranslator_dev.error.log info;
- In /etc/nginx/sites-enabled/btranslator_dev, modify the name of the unix socket on the fastcgi_pass line:
location ~ \.php$ {
fastcgi_split_path_info ^(.+\.php)(/.+)$;
include fastcgi_params;
# Intercepting errors will cause PHP errors to appear in Nginx logs
fastcgi_intercept_errors on;
fastcgi_pass unix:/var/run/php-fpm.sock;
}
- In /etc/nginx/sites-enabled/btranslator_dev, add the index line as well, at the root location:
location / {
index index.php;
try_files $uri $uri/ @rewrite;
}
- In /etc/nginx/sites-enabled/btranslator_dev, allow only localhost to access txt and log files:
location ~* \.(txt|log)$ {
allow 127.0.0.1;
deny all;
}
- In /etc/nginx/nginx.conf, decrease the number of worker processes to 1 or 2:
# worker_processes 4;
worker_processes 2;
These modifications are all we need, and then we can reload or restart the nginx service:
service nginx restart
4. Configuration for phpMyAdmin
Add these lines inside the server section, in /etc/nginx/sites-enabled/btranslator_dev:
# Configuration for phpMyAdmin
location /phpmyadmin {
root /usr/share/;
index index.php index.html index.htm;
location ~ ^/phpmyadmin/(.+\.php)$ {
try_files $uri =404;
root /usr/share/;
fastcgi_pass unix:/var/run/php-fpm.sock;
fastcgi_index index.php;
fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
include /etc/nginx/fastcgi_params;
}
location ~* ^/phpmyadmin/(.+\.(jpg|jpeg|gif|css|png|js|ico|html|xml|txt))$ {
root /usr/share/;
}
}
location /phpMyAdmin {
rewrite ^/* /phpmyadmin last;
}
Then reload the nginx service.
5. SSL (HTTPS) support
Add these lines in /etc/nginx/sites-enabled/btranslator_dev:
server {
listen 80;
listen 443 ssl;
ssl_certificate /etc/ssl/certs/ssl-cert-snakeoil.pem;
ssl_certificate_key /etc/ssl/private/ssl-cert-snakeoil.key;
. . . . .
}
Since SSL connections have some overhead, to make them more efficient add these lines as well in /etc/nginx/nginx.conf (in order to increase the session timeout and to use less expensive encryption):
http {
. . . . .
#keepalive_timeout 65;
keepalive_requests 50;
keepalive_timeout 300;
## Global SSL options
ssl_ciphers HIGH:!aNULL:!MD5:!kEDH;
ssl_prefer_server_ciphers on;
ssl_protocols TLSv1;
ssl_session_cache shared:SSL:10m;
ssl_session_timeout 10m;
. . . . .
}
Then reload nginx.
6. Avoid any DOS attacks
In order to avoid any DOS attacks, add these lines in /etc/nginx/nginx.conf:
http {
. . . . .
## limit request frequency to 2 requests per second
limit_req_zone $binary_remote_addr zone=one:10m rate=2r/s;
limit_req zone=one burst=5;
. . . . .
}
7. Full configuration of the site
A full version of the file /etc/nginx/sites-enabled/btranslator_dev looks like this:
server {
listen 80;
listen 443 ssl;
ssl_certificate /etc/ssl/certs/ssl-cert-snakeoil.pem;
ssl_certificate_key /etc/ssl/private/ssl-cert-snakeoil.key;
server_name dev.btranslator.org l10n-dev.org.al;
root /var/www-ssl/dev.btranslator.org;
access_log /var/log/nginx/btranslator_dev.access.log;
error_log /var/log/nginx/btranslator_dev.error.log info;
location = /favicon.ico {
log_not_found off;
access_log off;
}
location = /robots.txt {
allow all;
log_not_found off;
access_log off;
}
# This matters if you use drush
location = /backup {
deny all;
}
# Very rarely should these ever be accessed outside of your lan
location ~* \.(txt|log)$ {
allow 127.0.0.1;
deny all;
}
# This location block protects against a known attack.
location ~ \..*/.*\.php$ {
return 403;
}
# This is our primary location block.
location / {
index index.php;
try_files $uri $uri/ @rewrite;
expires max;
}
# This will rewrite our request from domain.com/node/1/ to domain.com/index.php?q=node/1
# This could be done in try_files without a rewrite; however, the GlobalRedirect
# module enforces no slash (/) at the end of URLs. This rewrite removes that,
# so no infinite redirect loop is reached.
location @rewrite {
rewrite ^/(.*)$ /index.php?q=$1;
}
# If a PHP file is served, this block will handle the request. This block
# works on the assumption that you are using php-cgi listening on /tmp/phpcgi.socket.
# Please see the php example (/usr/share/doc/nginx/examples/php) for more
# information about setting up PHP.
# NOTE: You should have "cgi.fix_pathinfo = 0;" in php.ini
location ~ \.php$ {
fastcgi_split_path_info ^(.+\.php)(/.+)$;
include fastcgi_params;
# Intercepting errors will cause PHP errors to appear in Nginx logs
fastcgi_intercept_errors on;
fastcgi_pass unix:/var/run/php-fpm.sock;
}
# The ImageCache module builds an image 'on the fly', which means that
# if it doesn't exist it needs to be created. Nginx is king of static,
# so there's no point in letting PHP decide if it needs to be served
# from an existing file.
# If the image can't be served directly, it's assumed that it doesn't
# exist and is passed off to PHP via our previous rewrite to let PHP
# create and serve the image.
# Notice that try_files does not have $uri/ in it. This is because an
# image should never be a directory. So there's no point in wasting a
# stat to serve it that way.
location ~ ^/sites/.*/files/imagecache/ {
try_files $uri @rewrite;
}
# As mentioned above, Nginx is king of static. If we're serving a static
# file that ends with one of the following extensions, it is best to set
# a very high expires time. This will generate fewer requests for the
# file. These requests will be logged if found, but not if they don't
# exist.
location ~* \.(js|css|png|jpg|jpeg|gif|ico)$ {
expires max;
log_not_found off;
}
# Configuration for phpMyAdmin
location /phpmyadmin {
root /usr/share/;
index index.php index.html index.htm;
location ~ ^/phpmyadmin/(.+\.php)$ {
try_files $uri =404;
root /usr/share/;
fastcgi_pass unix:/var/run/php-fpm.sock;
fastcgi_index index.php;
fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
include /etc/nginx/fastcgi_params;
}
location ~* ^/phpmyadmin/(.+\.(jpg|jpeg|gif|css|png|js|ico|html|xml|txt))$ {
root /usr/share/;
}
}
location /phpMyAdmin {
rewrite ^/* /phpmyadmin last;
}
}
8. References:
- http://arstechnica.com/business/2011/11/a-faster-web-server-ripping-out-apache-for-nginx/
- http://blog.celogeek.com/201209/202/how-to-configure-nginx-php-fpm-drupal-7-0/
- http://insready.com/blog/build-nginx-php-fpm-apc-memcache-drupal-7-bare-bone-ubuntu-1004-or-debian-5-server
- http://groups.drupal.org/node/238983
- http://groups.drupal.org/nginx
- http://www.howtoforge.com/running-phpmyadmin-on-nginx-lemp-on-debian-squeeze-ubuntu-11.04
- http://nginx.org/en/docs/http/configuring_https_servers.html
- http://wiki.nginx.org/HttpSslModule
- http://wiki.nginx.org/HttpLimitReqModule
- http://matt.io/technobabble/hivemind_devops_alert:_nginx_does_not_suck_at_ssl/ur
Installing a Clonezilla Server
Clonezilla Server is used to clone many computers simultaneously across a network. This is done using a DRBL server and computer workstations that can boot from a network.
1 Set a static IP
Ubuntu by default uses network-manager and automatic (DHCP) configuration for the network card. For a server it is better to do manual static (fixed IP) configuration.
First modify /etc/network/interfaces like this:
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet static
address 192.168.1.235
netmask 255.255.255.0
gateway 192.168.1.1
Then remove the package network-manager and restart networking:
aptitude purge network-manager
/etc/init.d/networking restart
2 Add a second IP address (alias) to the same card
Clonezilla is based on the DRBL server (which is a kind of light terminal server, with terminals booting over the network through PXE). The DRBL server itself needs one external network interface (connected to the WAN) and at least one internal interface (connected to the LAN). However, if we have just one network interface, we can add an alias to it and use the alias for the LAN.
Append these lines to /etc/network/interfaces:
auto eth0:0
iface eth0:0 inet static
name Ethernet alias LAN card
address 192.168.3.235
netmask 255.255.255.0
broadcast 192.168.3.255
network 192.168.3.0
Then restart networking:
/etc/init.d/networking restart
3 Installation on the server
Here are the steps for installing DRBL/Clonezilla on a Ubuntu Server:
- Add the key of the DRBL repository to apt:
wget -q http://drbl.org/GPG-KEY-DRBL -O- | sudo apt-key add -
- Create the file /etc/apt/sources.list.d/drbl.list which contains this line:
deb http://drbl.sourceforge.net/drbl-core drbl stable
- Install the package drbl:
apt-get update
apt-get install drbl
- Install the DRBL server:
/opt/drbl/sbin/drblsrv -i
For the installation steps see this example: http://drbl.sourceforge.net/one4all/examples/drblsrv_desktop_example.txt
- Setup the filesystem for the clients:
/opt/drbl/sbin/drblpush -i
4 References:
- http://www.upubuntu.com/2012/05/how-to-install-clonezilla-server-on.html
- https://help.ubuntu.com/community/Clonezilla_Server_Edition
- http://geekyprojects.com/cloning/setup-a-clonezilla-server-on-ubuntu/
- http://drbl.sourceforge.net/one4all/
- http://drbl.nchc.org.tw/one4all/desktop/download/stable/RELEASE-NOTES
The Digital Signature and the X.509/OpenPGP Authentication Models
This article explains what a Digital Signature is, why it is an important part of the Digital Identity, and how it works. Then it describes the authenticity and social problems related to the usage of the Digital Signature. It also explains the two authentication models, X.509 and OpenPGP, that can be used to solve these authenticity problems. Finally it compares these two authentication models and their features and tries to explain why the OpenPGP model is better.
Table of Contents
- 1 Introduction
- 1.1 What is the Digital Signature
- 1.2 Why the Digital Signature is Important
- 2 How the Digital Signature Works
- 2.1 What is a Hash Function
- 2.2 What is Asymmetric Key Cryptography
- 2.3 Signing a Digital Document
- 2.4 Verifying the Digital Signature of a Document
- 2.5 A Concrete Example
- 3 Authenticity Verification
- 3.1 Where to Find the Public Key
- 3.2 The Problem of Authenticity
- 3.3 Verifying and Signing Digital Certificates
- 3.4 Introducers and Certification Authorities (CAs)
- 4 The Hierarchical (X.509) Authentication Model
- 5 The Web-Of-Trust (OpenPGP) Authentication Model
- 5.1 Self-Signing Your Own Digital Certificate
- 5.2 Verifying and Signing Certificates of the Others
- 5.3 Deciding Whom To Trust
- 5.4 Deciding About the Validity/Authenticity of a Certificate
- 5.5 Calculating the Validity/Authenticity of a Certificate
- 5.6 Digital Notaries
- 6 Comparing the X.509 and OpenPGP Authentication Models
- 6.1 Inflexible vs Flexible and Versatile
- 6.2 Centralized vs Decentralized and Distributed
- 6.3 Vulnerable vs Robust and Reliable
- 7 Conclusion
- 8 Bibliography
1 Introduction
1.1 What is the Digital Signature
A Digital Signature is some digital data attached to a digital document, which cannot be falsified and which guarantees the integrity and the authenticity of the document.
The "integrity" means that the document has not been changed/corrupted since the time that it was digitally signed, either intentionally or by mistake. The "authenticity" means that we can verify and be sure about the person that signed the document (the author of the document).
So, the Digital Signature on digital documents has the same purpose as the hand-written signature on hard-copy (printed) documents.
The digital signature is not the same as a scanned version of a hand-signed hard-copy document, a faxed hand-signed paper document, or a scanned image of the hand-signature included in a document or attached to an email. All these methods may seem like intuitive substitutes for a digital signature; however, they do not fully guarantee the integrity and the authenticity of a document, and therefore cannot be used as valid digital signatures.
1.2 Why the Digital Signature is Important
Without the ability to sign digital documents, they can never be considered official, because we cannot be sure that they are original and we cannot be sure who the real author is (they can be corrupted and manipulated). So, despite using computers, digital systems and digital documents, we always have to rely on hard copies of the documents and keep them around for official purposes, since we can't fully trust the digital documents. This means that we will never be able to build totally digital systems for institutions and organizations, free from papers and hard-copy documents. For example, if a citizen has to interact with a governmental institution and sends some documents online, he nevertheless still has to submit the hard copies of the documents in person, since we can't rely on the authorship and correctness of the digital documents.
Or if a person sends a document by email to a bank, the bank cannot rely on it, since it cannot be fully sure about the authenticity (real authorship) of the document, that it is not manipulated or corrupted somehow, or that it is not just a trick or deception.
Only the Digital Signature can guarantee the identity of the author of a document and establish a secure relationship between people and digital documents. So, it is an essential tool for enabling/supporting the digital identity, for establishing trust and security in the digital world, and for building a digital society (digital governance, digital business, etc.).
2 How the Digital Signature Works
The Digital Signature is based on the hash functions and the so called asymmetric key cryptography (private/public key pairs).
2.1 What is a Hash Function
The job of a hash function is to digest (process) an electronic document and to generate an extract from it. No matter how big the document is, the extract always has the same fixed size. For a good hash function it is computationally infeasible to find two different documents that produce the same extract. A document that is changed even by a single character will produce a completely different extract after being digested by the hash function.
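As a quick illustration with SHA-256 (one widely used hash function, available on most systems as sha256sum), two inputs that differ by a single character produce completely different extracts of the same fixed length:

```shell
# Digest two messages that differ in a single character; each extract is
# 64 hex characters (256 bits), and the two are completely different:
printf 'Hello, world!' | sha256sum
printf 'Hello, World!' | sha256sum
```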
2.2 What is Asymmetric Key Cryptography
Unlike symmetric cryptography, which uses the same key for both encryption and decryption, asymmetric cryptography uses one key for encryption and a different one for decryption. Each person has a pair of encryption keys; one of them is private (secret) and known only by the person himself, while the other key is public and known by everybody. A message that is encrypted by one of the keys can be decrypted only by the other key of the pair. It is practically impossible to derive the private key from the public key that is known by everybody.
The algorithms that are used for generating a pair of private/public keys and for encrypting and decrypting a message are based on the arithmetic of large prime numbers and calculations with residue classes. It is not difficult to understand them, however this is not the proper place to explain such details. It is enough to know that the asymmetric key cryptography is thought to be quite secure and unbreakable.
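A minimal sketch of a key pair in action, using the openssl command-line tool (assuming it is installed; the file names and the message are illustrative): a message encrypted with the public key can be decrypted only with the matching private key.

```shell
# Generate a private/public RSA key pair (2048-bit):
openssl genpkey -algorithm RSA -pkeyopt rsa_keygen_bits:2048 -out private.pem 2>/dev/null
openssl pkey -in private.pem -pubout -out public.pem
# Encrypt a short message with the public key ...
printf 'a secret message' > msg.txt
openssl pkeyutl -encrypt -pubin -inkey public.pem -in msg.txt -out msg.enc
# ... and decrypt it with the private key of the same pair:
openssl pkeyutl -decrypt -inkey private.pem -in msg.enc
```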
2.3 Signing a Digital Document
To sign a digital document, these steps are followed:
- The digital document is digested by the hash function and a digital extract is produced.
- The digital extract of the document is encrypted with the private key of the author.
- This encrypted extract of the document is the digital signature and it can either be appended to the original document, or can be saved as a separate file.
2.4 Verifying the Digital Signature of a Document
The verification of the digital signature of a document is done like this:
- The digital document is digested by the hash function and its digital extract is produced.
- The digital signature of the document is decrypted with the public key of the author. This gives us the digital extract of the original document.
- The digital extract of the current document (from the first step) and the digital extract of the original document (from the second step) are compared. If they are the same, then the signature is good and the document is original. Otherwise the signature is bad and the authenticity of the document cannot be guaranteed (most probably it has been corrupted, intentionally or by error).
If these digital extracts are not equal, either the content of the document has been corrupted (by error or intentionally), or the author of the document is not the one who claims to be, or both. Any of these reasons is enough to discard the document as invalid.
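The signing and verification steps above can be sketched end to end with the openssl command-line tool (assuming it is installed; the key and file names are illustrative):

```shell
# Generate a key pair for the author:
openssl genpkey -algorithm RSA -pkeyopt rsa_keygen_bits:2048 -out author.key 2>/dev/null
openssl pkey -in author.key -pubout -out author.pub
printf 'An example document.\n' > document.txt
# Sign: digest the document with SHA-256 and encrypt the digest
# with the private key, producing a detached signature file:
openssl dgst -sha256 -sign author.key -out document.sig document.txt
# Verify: digest the document again and compare it against the
# decrypted signature; prints "Verified OK" if the document is intact:
openssl dgst -sha256 -verify author.pub -signature document.sig document.txt
```

Changing even one character of document.txt after signing makes the last command report a verification failure.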
2.5 A Concrete Example
Email is a kind of digital document, and it can be signed digitally. Actually it is the document that is most widely used with a digital signature nowadays. This is probably due to the fact that the Internet of today is not secure, and emails can be faked easily, and one cannot be completely sure about its authenticity, unless it is digitally signed.
Suppose that Alice sends an email to Bob. She signs this email using her private key. Then Bob verifies the signature using the public key of Alice. If the verification is successful, then Bob can be sure that this email cannot have been signed except with the private key of Alice. Since only Alice has her private key, then only she can be the signer (and hence the author) of the message.
3 Authenticity Verification
3.1 Where to Find the Public Key
Consider again the example of the last section, where Alice sends an email to Bob. Where can Bob find the public key of Alice, so that he can verify the authenticity of her message?
There are several ways that Bob can get her public key. Maybe Alice gave it to him directly, using a removable media or sending it as an attachment. Maybe Alice published it on her website and Bob got it from there. Maybe Alice published it on some public key server and Bob retrieved it from there (and this is the most common case in practice).
A Public Key Server (PKS) is like a directory server (a dictionary), where you can look up and retrieve the public key of a given person. Alice can upload her public key on a PKS, and Bob (or anyone else that needs to verify her signature) can look up and retrieve this key from there.
Actually the public key of a person is stored in a digital document that contains also the identity of a person (name, email, address, organization, etc.). This document is called Digital Certificate (or Identity Certificate, or Public Key Certificate). It is the Digital Certificate that is uploaded to a PKS and retrieved from it, and it is the Digital Certificate that makes the relation (connection) between the digital identity of a person and his public key.
3.2 The Problem of Authenticity
Here we are faced with a problem. If Bob retrieved the digital certificate of Alice from a PKS, how can he be sure that it is authentic? How can he be sure that this certificate was uploaded there by Alice and the public key in it really belongs to Alice? Probably somebody else uploaded that digital certificate there instead of Alice, with the identity of Alice but with a different public key.
This is actually a social problem, not a technical one, and it can be solved by social means. Bob can actually call Alice and make sure that the ID of her key is the same as the one that he got from the PKS. Or probably Alice gave Bob a business card where she has also written the ID of her public key, so Bob can check this ID with the ID of the key that he retrieved from the PKS and make sure that it is correct.
However most of the time we communicate with people that we have never met before and we have no idea who they are. It can be that "Alice" is just a fake identity (a nickname or a fake name, not the name of a real person). Or maybe somebody else uploaded the certificate instead of Alice, pretending to be Alice, and the key in the certificate does not really belong to Alice (is a fake public key).
If Bob has never met Alice, then how can he be sure about her real identity? How can he be sure that Alice is a real person and that the messages that he gets are really coming from her and not from somebody pretending to be her? In other words, how can Bob be sure that the digital certificate of Alice, that he gets from the PKS, is authentic?
Just verifying that the signature of the message is correct is not enough. We need to verify also that the digital certificate that was used for the signature is authentic.
Again, this is a social problem and cannot be solved only by technical means. It can be solved only by a combination of social and technical procedures.
3.3 Verifying and Signing Digital Certificates
Suppose that Chloe has checked the digital certificate of Alice and is sure that:
- Alice is a real person and the digital identity on her digital certificate corresponds to her real-life identity and is correct.
- The public key in the digital certificate is the correct one (the one that belongs to Alice).
Now that Chloe has verified that the digital certificate of Alice is authentic, she can sign it. A digital certificate is just a digital document, so it can be signed with a digital signature.
By signing the digital certificate of Alice, Chloe testifies that it is correct and valid, which means that the digital identity is authentic and the public key really belongs to Alice. The digital signature of Chloe also guarantees that the information on the digital certificate has not been changed since the time that she verified and signed it.
3.4 Introducers and Certification Authorities (CAs)
If Bob has full trust in Chloe for checking and verifying the information of digital certificates, then he can be sure that the digital certificate of Alice is authentic and valid, without having to check and verify it himself.
So, Bob asserts (derives) the validity/authenticity of the digital certificate of Alice by trusting a third party, which is Chloe. Bob can trust as well any other digital certificates that Chloe has signed. In such a case Chloe is called an "introducer" for Bob.
If Chloe verifies and signs a lot of digital certificates and a lot of people trust the certificates signed by Chloe, then Chloe is called a Certification Authority (CA).
4 The Hierarchical (X.509) Authentication Model
The X.509 authentication model is a hierarchical one. The digital certificate of a person is verified and signed by a certification authority (CA), the digital certificate of this CA is verified and signed by a higher-level CA, and so on, until we reach a root CA, whose digital certificate is self-signed (it has signed its own digital certificate).
For example, if Bob receives a message signed with the digital certificate of Alice, he will notice that this digital certificate is verified and signed by CA1, which in turn is verified and signed by CA2, which is verified and signed by RCA (a root CA). Bob just has to check that the certificate of the root CA is correct (valid and authentic), and then he has to trust that each of RCA, CA1 and CA2 have done the verification and signing properly. He doesn't have to check and verify the certificate of Alice directly. This chain verification is usually done automatically by the software that Bob uses.
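This chain verification can be illustrated with the openssl command-line tool. The sketch below is only a toy example with hypothetical file names, and for brevity the "chain" consists of just a root CA that signs Alice's certificate directly:

```shell
# Work in a throw-away directory.
cd "$(mktemp -d)"

# The root CA creates a self-signed certificate.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
    -keyout rca.key -out rca.pem -subj "/CN=RCA"

# Alice creates a key pair and a certificate request,
# and the root CA verifies her identity and signs it.
openssl req -newkey rsa:2048 -nodes \
    -keyout alice.key -out alice.csr -subj "/CN=Alice"
openssl x509 -req -days 1 -in alice.csr \
    -CA rca.pem -CAkey rca.key -CAcreateserial -out alice.pem

# Bob checks Alice's certificate against the root CA he trusts.
openssl verify -CAfile rca.pem alice.pem
```

In a real chain Bob would also pass the intermediate CA certificates (with the -untrusted option) so that openssl can walk the whole chain up to the root.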
The digital certificate (public key) of the root authority has to be widely known and easily verifiable. And also Bob has to trust it (actually it turns out that Bob does not have much choice on this, because other people have decided that Bob should trust it). The validation of a certificate is based on the trust that Bob has that the root CA and each of the CAs have done their job properly (checking and verifying the certificates of the next level).
CAs are usually commercial, but large institutions or government entities may have their own CAs as well. There are about 50 root CAs that are known worldwide.
5 The Web-Of-Trust (OpenPGP) Authentication Model
The OpenPGP standard uses a non-hierarchical, decentralized authentication model that is called Web-Of-Trust.
5.1 Self-Signing Your Own Digital Certificate
In the OpenPGP model each person acts as a root CA and first of all self-signs his own digital certificate (to protect it from any modification and forgery). For example Alice signs her own certificate and Bob signs his own.
5.2 Verifying and Signing Certificates of the Others
Second, each of them can sign the certificates of the people whom they have personally checked and verified. Verification includes both making sure that the digital identity matches the real-life identity of the person, and making sure that the public key in the certificate is the correct one that belongs to this person. This certificate verification and signing can be mutual as well; for example, Alice signs the certificate of Bob, and Bob signs the certificate of Alice.
When Alice signs the certificate of Bob, usually she makes this signature public by uploading the signed certificate to a PKS. This lets everybody know that she has checked and verified the digital certificate of Bob and that she guarantees that this certificate is authentic and valid.
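With GnuPG this looks roughly as follows. The sketch below is self-contained (it works in a throw-away keyring, and the names, email addresses and the keyserver are hypothetical); it assumes gpg version 2.1 or later:

```shell
# Use a throw-away GnuPG home, so the real keyring is untouched.
export GNUPGHOME="$(mktemp -d)"
chmod 700 "$GNUPGHOME"

# Alice and Bob each create a self-signed certificate
# (no passphrase, for this demo only).
gpg --batch --pinentry-mode loopback --passphrase '' \
    --quick-generate-key "Alice <alice@example.org>" default default never
gpg --batch --pinentry-mode loopback --passphrase '' \
    --quick-generate-key "Bob <bob@example.org>" default default never

# Alice signs Bob's certificate (normally only after checking his
# identity and his key fingerprint in person).
bob_fpr=$(gpg --list-keys --with-colons bob@example.org \
          | awk -F: '/^fpr/ {print $10; exit}')
gpg --batch --yes --pinentry-mode loopback --passphrase '' \
    --default-key alice@example.org --quick-sign-key "$bob_fpr"

# Show the signatures that Bob's certificate now carries.
gpg --check-sigs bob@example.org

# To publish the signed certificate, Alice would run something like:
#   gpg --keyserver hkps://keys.openpgp.org --send-keys "$bob_fpr"
```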
5.3 Deciding Whom To Trust
Next, each person decides which people he can trust to make correct verification of others' certificates, and how much he can trust them. The trust levels defined by the OpenPGP standard are: unknown (default), none, marginal, full, ultimate. These trust values are not about how trustworthy this person is in real life, but rather about the ability of the person to make correct verification of digital certificates before signing them.
For example, the trust value marginal means that you believe that this person sometimes may not check and verify carefully the details of a certificate before signing it. The trust value full means that you believe that this person is very careful when signing certificates. The trust value ultimate means that you believe that this person is so careful when checking and signing certificates that he almost never makes mistakes.
The trust level that one assigns to a person is subjective and can be different from one person to another. For example, Alice may have full trust in Bob, while Chloe may think that Bob can be trusted only marginally. The trust level is also private, which means that it is relevant only to the person who assigns it, and it is not published on any servers.
5.4 Deciding About the Validity/Authenticity of a Certificate
The figure shows a web of trust rooted at Alice. The graph illustrates who has signed whose certificate.
Alice is sure that the certificates of Blake and Dharma are valid, since she has verified and signed them herself.
If Alice has full trust in Dharma, then she would consider the certificates of Chloe and Francis valid as well. She has not verified them herself, but Dharma has verified and signed them, and Alice fully trusts Dharma's ability to correctly verify and sign digital certificates.
In case Alice has only marginal trust in Blake and Dharma, she cannot be really sure about the validity of Francis' certificate, although Dharma has signed it. However, she can be almost sure about the validity of Chloe's certificate: both Blake and Dharma have verified and signed it, so the possibility of both of them being deceived (or corrupted, or mistaken) is small.
5.5 Calculating the Validity/Authenticity of a Certificate
The decision about which certificate can be considered fully valid, partially valid, or non-valid is actually made automatically by the software that is used for verifying the signature. The software makes this decision based on who has signed whose certificate and on the trust value assigned to each of the people in the web of trust, applying certain rules that are used to calculate the validity (authenticity) of a certificate. Such a rule can be, for example: a certificate that is signed by at least three marginally trusted people can be considered fully valid.
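The idea of such a rule can be sketched in a few lines of shell. This is only an illustration, not the actual algorithm of any OpenPGP implementation; the thresholds 1 and 3 mirror GnuPG's defaults:

```shell
# Decide the validity of a certificate from the number of signatures
# made by fully trusted and by marginally trusted people.
completes_needed=1   # signatures of fully trusted people needed
marginals_needed=3   # signatures of marginally trusted people needed

validity() {
    local full=$1 marginal=$2
    if [ "$full" -ge "$completes_needed" ] || \
       [ "$marginal" -ge "$marginals_needed" ]; then
        echo "fully valid"
    elif [ "$marginal" -ge 1 ] || [ "$full" -ge 1 ]; then
        echo "marginally valid"
    else
        echo "unknown"
    fi
}

validity 1 0   # signed by one fully trusted person
validity 0 2   # signed by two marginally trusted people
validity 0 0   # no trusted signatures at all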
The validation rules are customizable and can be different for each person, in order to fit the security requirements of everybody. For example, if Alice does not have any high security needs, and she lives in a friendly (not hostile) environment, then she may decide that even two marginally trusted signatures are enough to consider a certificate fully valid. However, if she has high security requirements and she lives in a rather hostile environment, then she can decide that at least five marginally trusted signatures should be required, so that a certificate that she has not verified herself can be considered valid. In this case, since she has decided to depend less on the verifications done by the others, she has to do more verifications on her own.
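GnuPG, for example, exposes exactly these knobs in its configuration file gpg.conf; the values shown below are GnuPG's defaults, and raising them makes the validation stricter:

```
# ~/.gnupg/gpg.conf
completes-needed 1    # signatures of fully trusted people needed
marginals-needed 3    # signatures of marginally trusted people needed
max-cert-depth 5      # how long a chain of introducers may be
```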
5.6 Digital Notaries
Sometimes there are people who verify and sign a great many of the others' digital certificates, even on a full-time basis, and who are trusted by everybody (or at least by a lot of people). These people play the role of a CA (Certification Authority) in the OpenPGP model.
Such people can be for example the head of the IT department in a company or institution. Or they can be people approved, verified and authorized by the government to offer this kind of service to the citizens. In this case they can also be called Digital Notaries and they may offer other Digital Services as well, besides verifying and signing digital certificates.
The Digital Notaries can also be held responsible before the law for the correctness and truthfulness of the verifications and signatures that they make (as well as for other digital or non-digital services that they may offer). This accountability can be very useful for increasing their responsibility, as well as for increasing people's trust in them and the health and reliability of the web-of-trust system as a whole.
6 Comparing the X.509 and OpenPGP Authentication Models
The digital certificates of both standards, X.509 and OpenPGP, are very similar in content and are based on the same principles (asymmetric cryptography, private/public key pairs, etc.). However, their authentication models are different and not interoperable. This means that a digital certificate that is recognized as valid and authentic by one of them cannot be recognized as such by the other.
However, both of them can be used concurrently (at the same time) without interfering with each other. This means that a person can have one certificate of type X.509 and another of type OpenPGP at the same time, and use either one of them as needed. This is also facilitated by the fact that most of the software used for digital signatures supports both of these standards.
6.1 Inflexible vs Flexible and Versatile
If we compare the structure of the authentication models of the X.509 and the OpenPGP standards, we will notice that the first one closely resembles a tree (is hierarchical, like the structure of the private/governmental organizations), while the second one resembles a web or mesh (like the structure of the Internet).
A mesh is a much more flexible structure than a tree, because a tree structure is just a special case of a mesh structure.
In the web-of-trust authentication model of OpenPGP there can be CAs as well (as in the case of the Digital Notaries that we discussed previously). If many people choose to fully trust the same CA for checking the validity/authenticity of the others' certificates (and they all configure their own copies of the OpenPGP client software to trust that CA), then the OpenPGP model acts just like the X.509 model. In fact, the web model of OpenPGP is a proper superset of the hierarchical model of X.509.
There is no situation in the X.509 model that cannot be handled exactly the same way in the OpenPGP model. But OpenPGP can do much more.
In the X.509 model the set of trusted root CAs is fixed and predetermined. The users have no choice and can make no decision about whether to trust them or not; these CAs are even "baked into" the major software that uses digital certificates (e.g. browsers). In the OpenPGP model, on the other hand, the users can decide for themselves whom to trust and how much to trust them.
6.2 Centralized vs Decentralized and Distributed
We can notice as well that the hierarchical model is centralized, while the web-of-trust model is distributed and decentralized. This is related to who is responsible for ensuring the correctness, authenticity and validity of the certificates, the security, trustability and reliability of the whole system, etc.
In the hierarchical model this responsibility falls on some central authority (the root CA) and on the sub-authorities (CAs) that it approves. In the web-of-trust model this responsibility falls on everybody participating in the system, since each of them helps to verify and validate the certificates of the others. So, in the web-of-trust model, each person that holds a digital certificate is verified by the others and helps to verify the others at the same time. This is a more democratic model, one that encourages the responsibility and the participation of the citizens.
6.3 Vulnerable vs Robust and Reliable
The decentralized/distributed model is also more robust and reliable than the hierarchical model.
The hierarchical model has a single point of failure that has to be watched, protected and guarded very carefully, since it is a clear target of attack: the root CA. If its security is ever compromised or corrupted, then the security of the whole system is compromised and all of the digital certificates of the system are rendered invalid.
This doesn't have to be a technical failure (for example some hackers breaking into the system); it can be a social corruption as well (and this can be even more likely than a technical failure). This risk is amplified by the fact that most of the CAs are commercial. Matt Blaze once made the cogent observation that a commercial CA will protect you against anyone from whom that CA refuses to accept money!
The distributed model, on the other hand, is much more difficult to corrupt, because each participant is a little CA on its own. Maybe some of them can be corrupted for some time, but it is quite difficult to corrupt many or most of them at the same time. In any case, only local damage can be inflicted; the whole system will survive the attack, and with time it can auto-correct and heal itself gradually.
7 Conclusion
It is quite easy to understand the concept of Digital Signatures and the basics of how they work. The Digital Signature is so important that it will become an inevitable part of our future digital societies.
A very important aspect of the digital signature is the verification of its authenticity. This turns out to be more a social problem than a technical one, so it can be solved correctly only by the right combination of social and technical means.
Currently, there are two models (or infrastructures) for solving the authentication problem. One of them is the Hierarchical model (X.509 standard), and the other one is the Web-Of-Trust model (OpenPGP standard). The Web-Of-Trust model is more flexible and advanced than the Hierarchical model, but it requires that everybody that participates in it takes responsibility and makes decisions for himself.
However, I think that the Web-Of-Trust is the right approach, because personal privacy and security are, by definition, personal responsibilities, and they cannot be outsourced.
8 Bibliography
- http://en.wikipedia.org/wiki/Digital_signature
- http://en.wikipedia.org/wiki/Public_key
- http://en.wikipedia.org/wiki/Digital_certificate
- http://en.wikipedia.org/wiki/X.509
- http://en.wikipedia.org/wiki/Web_of_trust
- http://www.youdzone.com/signature.html
- http://www.gnupg.org/gph/en/manual.html
- http://www.cryptnet.net/fdp/crypto/keysigning_party/en/keysigning_party.html
- http://www.openpgp.org/technical/whybetter.shtml
- http://enigmail.mozdev.org/
- http://www.gpg4win.org/
Author: Dashamir Hoxha <dashohoxha@gmail.com>
Date: 20 Sep, 2011
Virtual Machines on a CentOS Host
A powerful rack server can be used as a host for installing lots of virtual machines, and it can be used as a data storage as well. This article will describe how to use such a server, installed with CentOS, as a host for virtual machines.
The server is a SuperMicro X9DRL-3F/IF with these parameters:
RAM     | 32GB
HDD     | 2x320GB + 2x3TB
Network | 2 Gb interfaces + 1 KVM/IPMI interface
The first two disks (2x320GB) are used in RAID1 configuration to keep the host and the virtual servers, and the 2x3TB disks are used for the data.
Table of Contents
- Installation of CentOS
- Disk partitioning and formatting
- Managing partitions with LVM
- Creating bridged interfaces on CentOS
- Installing KVM and libvirt
1. Installation of CentOS
During boot-up, we press Ctrl+I and configure a RAID1 device with the first two disks (320GB each).
Installation was done with CentOS-6.2-x86_64-minimal.iso (standard installation, where the installer automatically partitions the first disk drive, the RAID one). The standard installation of CentOS is very easy (with a GUI interface): it uses the RAID device automatically, partitions the disk automatically, and uses LVM for the partitions.
2. Disk partitioning and formatting
We have two disks of 3TB each that we need to partition and manage; however, partition tables of type msdos (the most commonly used) cannot manage more than 2TB of disk space. The solution is to use partition tables of type GPT. Fortunately, the partition editor (parted) of Linux supports them quite well.
GPT disks can also have more than 4 primary partitions, so there is no need for extended partitions and similar tricks. I split both of the disks into 6 primary partitions of 500GB each, formatting them with ext4. Later, I am going to manage these partitions with LVM.
The steps below show roughly how I did the partitioning:
yum install parted
parted /dev/sdc
parted /dev/sdd

(parted) mklabel gpt
(parted) mkpart primary 0.0GB 500.0GB
(parted) mkpart primary 500.0GB 1.0TB
(parted) mkpart primary 1.0TB 1.5TB
(parted) mkpart primary 1.5TB 2.0TB
(parted) mkpart primary 2.0TB 2.5TB
(parted) mkpart primary 2.5TB 3.0TB
(parted) print
Model: ATA ST3000DM001-9YN1 (scsi)
Disk /dev/sdd: 3001GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt

Number  Start   End     Size   File system  Name     Flags
 1      1049kB  500GB   500GB               primary
 2      500GB   1000GB  500GB               primary
 3      1000GB  1500GB  500GB               primary
 4      1500GB  2000GB  500GB               primary
 5      2000GB  2500GB  500GB               primary
 6      2500GB  3001GB  501GB               primary
(parted)

And this is how I formatted them with ext4:

mkfs.ext4 /dev/sdc1
mkfs.ext4 /dev/sdc2
mkfs.ext4 /dev/sdc3
mkfs.ext4 /dev/sdc4
mkfs.ext4 /dev/sdc5
mkfs.ext4 /dev/sdc6
for i in 1 2 3 4 5 6; do mkfs.ext4 /dev/sdd$i ; done
3. Managing partitions with LVM
One of the biggest advantages of Logical Volume Management (LVM) is its flexibility: LVM disks and partitions can be resized easily, when needed. In the terminology of LVM, logical disks are called Volume Groups (VG), and logical partitions are called Logical Volumes (LV). We can create several VGs, and inside each of them we can create LVs. The sizes of VGs and LVs are flexible; we can extend them later, if needed.
Let's create a volume group named vg_data by including some Physical Volumes (PVs, physical disk partitions) in it:

vgcreate vg_data /dev/sdc1 /dev/sdc2 /dev/sdd1 /dev/sdd2
vgdisplay

Then we can extend it by adding some more PVs (partitions) to it:

vgextend vg_data /dev/sdc3 /dev/sdc4 /dev/sdd3 /dev/sdd4
vgdisplay

Now, inside the VG named vg_data, let's create an LV (logical partition) named lv_mirror, of size 1TB:

lvcreate vg_data -L 1T -n lv_mirror
lvdisplay

We can create an ext4 filesystem on it like this:

mkfs.ext4 -L mirror /dev/vg_data/lv_mirror

Another LV can be created like this:

lvcreate vg_data -L 500G -n lv_cache
lvdisplay
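To illustrate the flexibility mentioned above, an LV created this way can later be grown without reformatting. This is only a sketch (the partitions and sizes are illustrative, and it assumes the VG still has, or is given, enough free space); growing an ext4 filesystem online is safe, while shrinking is much more delicate:

```shell
# Add two more physical volumes to the volume group, if needed.
vgextend vg_data /dev/sdc5 /dev/sdd5

# Grow the logical volume by 200GB, then grow the ext4
# filesystem on it to fill the new size.
lvextend -L +200G /dev/vg_data/lv_mirror
resize2fs /dev/vg_data/lv_mirror
```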
4. Creating bridged interfaces on CentOS
We want the virtual machines to be connected directly to the network, and for this reason we should create bridged interfaces on the host system. We create a bridged interface for each of the network interfaces of the server. The steps below show how it can be done on CentOS.
- Edit /etc/sysconfig/network-scripts/ifcfg-eth0:
  DEVICE="eth0"
  HWADDR="00:25:90:76:92:AA"
  ONBOOT="yes"
  BRIDGE="br0"
- Edit /etc/sysconfig/network-scripts/ifcfg-eth1:
  DEVICE="eth1"
  HWADDR="00:25:90:76:92:AB"
  ONBOOT="yes"
  BRIDGE="br1"
- Edit /etc/sysconfig/network-scripts/ifcfg-br0:
  DEVICE="br0"
  TYPE="Bridge"
  BOOTPROTO="static"
  ONBOOT="yes"
  IPADDR="192.168.100.254"
  NETMASK="255.255.255.0"
  DELAY="0"
- Edit /etc/sysconfig/network-scripts/ifcfg-br1:
  DEVICE="br1"
  TYPE="Bridge"
  BOOTPROTO="static"
  ONBOOT="yes"
  IPADDR="192.168.10.254"
  NETMASK="255.255.255.0"
  DELAY="0"
  GATEWAY="192.168.10.1"
- Restart the network:
  service network restart
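After the restart, the new bridges can be checked like this (brctl comes from the bridge-utils package, which may need to be installed first):

```shell
yum install bridge-utils   # only if brctl is missing
brctl show                 # should list br0 and br1 with their attached interfaces
ip addr show br0           # the IP address should now be on the bridge, not on eth0
```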
5. Installing KVM and libvirt
- First check if the CPU supports hardware virtualization:
  egrep '(vmx|svm)' --color=always /proc/cpuinfo
- Install kvm and libvirt:
  rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY*
  yum install kvm libvirt python-virtinst qemu-kvm
- Modify /etc/libvirt/libvirtd.conf and uncomment mdns_adv = 0. Then restart libvirtd and check it with virsh:
  service libvirtd restart
  virsh -c qemu:///system list
- Add a user that can manage the virtual machines. We would like to be able to manage the virtual machines remotely (for example with virt-manager), and it is not a good idea to use the root account for that. So we create another account, virtadmin, that has permissions to manage the virtual machines. These permissions are assigned to it simply by adding it to the group kvm:
  useradd virtadmin
  passwd virtadmin
  usermod -a -G kvm virtadmin
- Set SELINUX=disabled in /etc/selinux/config and then reboot:
  # This file controls the state of SELinux on the system.
  # SELINUX= can take one of these three values:
  #   enforcing - SELinux security policy is enforced.
  #   permissive - SELinux prints warnings instead of enforcing.
  #   disabled - No SELinux policy is loaded.
  SELINUX=disabled
  # SELINUXTYPE= can take one of these two values:
  #   targeted - Targeted processes are protected,
  #   mls - Multi Level Security protection.
  SELINUXTYPE=targeted
- For easy backup, we keep all the configurations and images in a separate directory, called /systems (which can also be on a separate partition). Move all the configurations and settings to /systems, like this:
  mkdir /systems
  mv /etc/libvirt /systems/etc
  ln -s /systems/etc /etc/libvirt
  mv /var/lib/libvirt/ /systems/var
  ln -s /systems/var/ /var/lib/libvirt
  mkdir /systems/images
  Then modify /systems/etc/storage/default.xml so that the path of the default storage pool points to /systems/images.
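With all of this in place, a first virtual machine can be created with virt-install (from the python-virtinst package installed above). The VM name, disk size and ISO path below are only illustrative; adjust them to your own setup:

```shell
virt-install --name=vm1 --ram=2048 --vcpus=2 \
    --disk path=/systems/images/vm1.img,size=20 \
    --network bridge=br0 \
    --cdrom=/systems/images/CentOS-6.2-x86_64-minimal.iso \
    --graphics vnc
```

The --network bridge=br0 option attaches the VM directly to the bridged interface created earlier, so it appears on the network like any other machine, and virtadmin can then connect to it remotely with virt-manager.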