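I recently took another look at Scribe, the distributed logging system. I had installed it without trouble on Ubuntu 9.10, but after moving to Ubuntu 10.04 and manually upgrading Ruby to 1.9.3, I cannot get Thrift (latest version, 0.9) to install. The build gets stuck here: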
    Successfully built RubyGem
    Name: thrift
    Version: 0.9.0.1
    File: thrift-0.9.0.1.gem
    gem install thrift-*.gem
    ERROR: Could not find a valid gem 'thrift-*.gem' (>= 0) in any repository
    rake aborted!
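None of the public gem repositories carries thrift-0.9.0.1 yet. The thrift-0.9 source tree does include this gem, so I installed it by hand with gem install --local thrift-0.9.0.1.gem, but running make install for Thrift afterwards still failed. I decided to leave it: the Thrift and Scribe I installed earlier still work, so I will keep using them for now.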
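One plausible reading of the error: the literal pattern thrift-*.gem reached gem, which suggests the shell found no matching file in the directory where the command ran, so gem treated the pattern as a gem name to look up remotely. A minimal shell sketch that resolves the built file first and only then calls gem install --local. The function name find_built_gem and the directory in the usage line are assumptions, not part of the Thrift build.

```shell
# find_built_gem DIR: print the path of the first thrift gem found in DIR,
# or fail if the glob matches nothing. (Hypothetical helper.)
find_built_gem() {
    dir=$1
    for f in "$dir"/thrift-*.gem; do
        # When nothing matches, sh leaves the pattern unexpanded,
        # so check that the result is a real file.
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    echo "no thrift-*.gem found in $dir" >&2
    return 1
}

# Usage (the directory is an assumption; use wherever the gem was built):
# gem_file=$(find_built_gem thrift-0.9.0/lib/rb) && gem install --local "$gem_file"
```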
Below is a Scribe installation guide quoted from someone else:
Scribe is a log aggregator, developed at Facebook and released as open source. Scribe is built on Thrift, a cross-language RPC type platform, and therefore it is possible to use scribe with any of the Thrift-supported languages. Whilst Perl is one of the supported languages, there is little in the way of working examples, so here’s how I did it:
1. Install Thrift.
2. Build and install FB303 perl modules
    cd thrift/contrib/fb303
    # Edit if/fb303.thrift and add the line 'namespace perl Facebook.FB303' after the other namespace declarations
    thrift --gen perl if/fb303.thrift
    sudo cp -a gen-perl/ /usr/local/lib/perl5/site_perl/5.10.0 # or wherever you keep your site perl
This creates the modules Facebook::FB303::Constants, Facebook::FB303::FacebookService and Facebook::FB303::Types.
3. Install Scribe.
4. Build and install Scribe perl modules
    cd scribe
    # Edit if/scribe.thrift and add 'namespace perl Scribe.Thrift' after the other namespace declarations
    thrift -I /path/to/thrift/contrib/ --gen perl if/scribe.thrift
    sudo cp -a gen-perl/Scribe /usr/local/lib/perl5/site_perl/5.10.0/ # or wherever
This creates the modules Scribe::Thrift::Constants, Scribe::Thrift::scribe, Scribe::Thrift::Types.
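The manual edit in steps 2 and 4 (adding a Perl namespace after the existing namespace declarations) could be scripted. This is a minimal sketch, assuming the .thrift file has one namespace declaration per line; add_perl_namespace is a hypothetical helper, not something shipped with Thrift or Scribe.

```shell
# add_perl_namespace FILE NAMESPACE: add 'namespace perl NAMESPACE' after the
# last existing namespace line in FILE. (Hypothetical helper.)
add_perl_namespace() {
    file=$1
    ns=$2
    # Do nothing if a Perl namespace is already declared.
    if grep -q '^namespace perl ' "$file"; then
        return 0
    fi
    # The file is read twice: the first pass records the last 'namespace'
    # line, the second pass prints the file with the new line added after it.
    # If the file has no namespace lines at all, it is left unchanged.
    awk -v ns="$ns" '
        NR == FNR { if ($1 == "namespace") last = FNR; next }
        { print }
        FNR == last { print "namespace perl " ns }
    ' "$file" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Usage, from the directories used in the steps above:
# add_perl_namespace if/fb303.thrift Facebook.FB303
# add_perl_namespace if/scribe.thrift Scribe.Thrift
```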
Here is an example program that uses the client (reading one line at a time from stdin and sending to a scribe instance running locally on port 1465):
    #!/usr/bin/perl
    use strict;
    use warnings;

    use Scribe::Thrift::scribe;
    use Thrift::Socket;
    use Thrift::FramedTransport;
    use Thrift::BinaryProtocol;

    my $host = 'localhost';
    my $port = 1465;
    # Category comes from the first command-line argument, defaulting to 'test'
    my $cat  = $ARGV[0] || 'test';

    # Build the Thrift stack: socket -> framed transport -> binary protocol
    my $socket    = Thrift::Socket->new($host, $port);
    my $transport = Thrift::FramedTransport->new($socket);
    my $proto     = Thrift::BinaryProtocol->new($transport);

    my $client = Scribe::Thrift::scribeClient->new($proto, $proto);
    # One LogEntry is reused; only its message changes per line
    my $le     = Scribe::Thrift::LogEntry->new({ category => $cat });

    $transport->open();

    while (my $line = <>) {
        $le->message($line);
        my $result = $client->Log([ $le ]);
        if ($result == Scribe::Thrift::ResultCode::TRY_LATER) {
            print STDERR "TRY_LATER\n";
        }
        elsif ($result != Scribe::Thrift::ResultCode::OK) {
            print STDERR "Unknown result code: $result\n";
        }
    }

    $transport->close();
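The author of the guide above wrote a Perl module, Log::Dispatch::Scribe, and built a script on top of it, scribe_cat.pl. With the right configuration, a web server can pipe its logs to this script, which forwards them to the scribed server. For example, the Apache log configuration looks like this: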
CustomLog "|/usr/local/bin/scribe_cat.pl --category=apache" combined
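With a piped log, Apache starts the program named after the pipe character and writes each log line to its standard input. To dry-run that contract before pointing CustomLog at scribe_cat.pl, a stand-in reader can be sketched as follows. pipe_reader is a hypothetical name, and it only prints lines; it does not talk to Scribe.

```shell
# pipe_reader CATEGORY: stand-in for a piped-log program. (Hypothetical.)
# Reads log lines on stdin, as Apache would deliver them, and prints
# "CATEGORY<TAB>line" instead of sending anything to Scribe.
pipe_reader() {
    category=$1
    # IFS= and -r keep leading whitespace and backslashes intact.
    while IFS= read -r line; do
        printf '%s\t%s\n' "$category" "$line"
    done
}

# Usage:
# printf 'GET / 200\n' | pipe_reader apache
```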
How does Facebook itself use Scribe? Here is what Facebook employees wrote on the SourceForge forum:
At Facebook, we just log all Apache errors locally to a file. Each machine is also running Scribe locally with Scribe configured to forward messages to a central location. We then run a simple script on each machine that tails the Apache log and writes the data to Scribe.
I wanted to know how do u keep tailing ? cause after a specific file-size or interval, the *_current sym link changes to a different file.
The unix command “tail --follow=name” current should do what you want. It will follow the current symlink and reopen the new log file when it gets rotated.
We use Ruby’s File::Tail (http://file-tail.rubyforge.org/doc/classes/File/Tail.html)
Perl also has a similar CPAN module: http://search.cpan.org/dist/File-Tail/
There is also “logtail.c”, which will tail the file and also keep a history of the last position that it read: http://www.drxyzzy.org/ntlog/logtail.c
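The last reply mentions keeping a history of the last read position. A minimal shell sketch of that idea (not logtail.c itself): each call prints only what was appended since the previous call and stores the byte offset in a state file. logtail_once, the state file, and the paths in the usage lines are all assumptions made for illustration.

```shell
# logtail_once LOG STATE: print what was appended to LOG since the last call,
# remembering the byte offset in STATE. (Hypothetical helper.)
logtail_once() {
    log=$1
    state=$2
    offset=0
    if [ -f "$state" ]; then
        offset=$(cat "$state")
    fi
    # $(( ... )) strips any padding that wc may print around the number.
    size=$(( $(wc -c < "$log") ))
    # A file shorter than the saved offset was rotated or truncated: start over.
    # (A rotated file that has already grown past the old offset is not detected.)
    if [ "$size" -lt "$offset" ]; then
        offset=0
    fi
    if [ "$size" -gt "$offset" ]; then
        # tail -c +N starts at byte N (1-based); head caps the output at the
        # size measured above so the saved offset matches what was printed.
        tail -c "+$((offset + 1))" "$log" | head -c "$((size - offset))"
    fi
    echo "$size" > "$state"
}

# Usage, e.g. from cron (paths are assumptions):
# logtail_once /var/log/apache2/error.log /var/tmp/apache_error.offset \
#     | /usr/local/bin/scribe_cat.pl --category=apache
#
# Continuous variant with GNU tail, following the name across rotation:
# tail -n 0 --follow=name --retry /var/log/apache2/error.log \
#     | /usr/local/bin/scribe_cat.pl --category=apache
```

The one-shot form trades latency for simplicity: nothing has to stay running between calls, and a restart does not resend lines that were already forwarded.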