Apache achieves Leet-ness

I was troubleshooting a coding issue for a customer the other day when I noticed the number of requests currently being processed: 1337. I am going to write about the underlying problem in the coming days. Basically, a bug in the customer's code (we do not support code for this customer) caused a large number of "sleep" states in MySQL as well as a huge number of "W Sending Reply" states in Apache.

Current Time: Thursday, 11-Feb-2010 11:47:54 CST
Restart Time: Thursday, 11-Feb-2010 09:17:12 CST
Parent Server Generation: 0
Server uptime: 2 hours 30 minutes 41 seconds
Total accesses: 226733 - Total Traffic: 10.8 GB
CPU Usage: u1812.56 s80.08 cu.04 cs0 - 20.9% CPU load
25.1 requests/sec - 1.2 MB/second - 49.8 kB/request
1337 requests currently being processed, 113 idle workers
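
If you want to keep an eye on that number without reloading the status page, something like the following might do. This is only a rough sketch: it assumes mod_status is enabled and that the machine-readable "?auto" view (which reports BusyWorkers and IdleWorkers) is reachable. The function name busy_workers and the URL in the usage comment are placeholders of mine, not anything taken from this server.

```shell
#!/bin/bash
# Sketch: pull the busy/idle worker counts out of mod_status "?auto" output.
# busy_workers is a hypothetical helper name; it reads the status text on stdin.
busy_workers() {
    awk -F': ' '
        $1 == "BusyWorkers" { busy = $2 }
        $1 == "IdleWorkers" { idle = $2 }
        END { printf "%d busy, %d idle\n", busy, idle }
    '
}

# Usage (the URL is an assumption, adjust to wherever server-status lives):
#   curl -s "http://localhost/server-status?auto" | busy_workers
```

Reading from stdin keeps the parsing separate from however you fetch the page, so the same function works against a saved copy of the status output.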

5 Minute Disk I/O in KB

OS: CentOS 5.4
Arch : x86_64
Version : 0.1a
Required Packages : net-snmp (provides snmpd), sysstat
snmpd conf addition:
exec .1.3.6.1.4.1.2021.40 SARDISKIO /usr/bin/mrtg_diskio
Client Script Used:
<script>
#!/bin/bash
#V 0.1b MisterX Dec 9th 2009
# Replace md0-md3 with the drives you want to watch
#OUTPUT example for first Drive :
#1.3.6.1.4.1.2021.40.101.1 tps
#1.3.6.1.4.1.2021.40.101.2 kB_read/s
#1.3.6.1.4.1.2021.40.101.3 kB_wrtn/s
#1.3.6.1.4.1.2021.40.101.4 kB_read
#1.3.6.1.4.1.2021.40.101.5 kB_wrtn
# and so on for each additional Drive. A snmpwalk -v2c -On -c $community $host 1.3.6.1.4.1.2021.40.101 will show
#you the full list for all drives.
for d in md0 md1 md2 md3; do
    # Run iostat once per drive and print tps, kB_read/s, kB_wrtn/s,
    # kB_read and kB_wrtn (columns 2-6), one value per line.
    # Matching $1 exactly keeps md1 from also matching md10.
    /usr/bin/iostat -dk | awk -v dev="$d" '$1 == dev { for (i = 2; i <= 6; i++) print $i }'
done
</script>
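
The "and so on for each additional Drive" part is easier to see with a little arithmetic. The sketch below assumes snmpd numbers each output line of the exec script sequentially under the .101 subtree, as the comments in the script describe, so with five values per drive the second drive starts at .101.6. The helper name diskio_oid is made up for illustration.

```shell
#!/bin/bash
# Sketch: map (drive position, field number) to the OID snmpd should expose.
# Assumes one OID per output line of the exec script, numbered from 1
# under <base>.101. diskio_oid is a hypothetical helper name.
diskio_oid() {
    local drive=$1   # 1 = first drive in the for loop (md0), 2 = md1, ...
    local field=$2   # 1=tps 2=kB_read/s 3=kB_wrtn/s 4=kB_read 5=kB_wrtn
    local base=".1.3.6.1.4.1.2021.40.101"
    echo "${base}.$(( (drive - 1) * 5 + field ))"
}

# Usage:
#   diskio_oid 1 4   # kB_read for the first drive
#   diskio_oid 2 5   # kB_wrtn for the second drive
```

It is worth confirming the result against a real snmpwalk before pasting OIDs into an MRTG Target line.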

MRTG Code :

<mrtg_config>
Target[$server_name-disk]: 1.3.6.1.4.1.2021.40.101.4&1.3.6.1.4.1.2021.40.101.5:$community@$remote_server
Title[$server_name-disk]:  Disk $drive 5 min Average I/O Utilization
MaxBytes[$server_name-disk]: 10240000000000000
PageTop[$server_name-disk]: <H1>5 min Avg. I/O Utilization Report</H1>
kmg[$server_name-disk]: KB,MB,GB
LegendI[$server_name-disk]: 5 min Avg. I/O KBread
LegendO[$server_name-disk]: 5 min Avg. I/O KBwrite
Legend1[$server_name-disk]: 5 min Avg. I/O KBread
Legend2[$server_name-disk]: 5 min Avg. I/O KBwrite
YLegend[$server_name-disk]: Kilobytes
ShortLegend[$server_name-disk]: &
Options[$server_name-disk]: growright,nopercent
</mrtg_config>

This is just an average to get an overview and will never replace good administration.
It's only meant to track trends and general issues, but if you use MRTG you probably have other tools you use and already know this ;)

Debugging Apache segfault with strace

OS: CentOS 4.8

Apache : Custom RPM from source with only a single change to the location of the suexec directory

strace -t -f -v -p $process -o /path/to/outputfile

(Note that $process is the PID of the primary Apache process.)

To find the main Apache process, run:

ps -ef | grep httpd

which returns something like this:

apache   26898 22378  8 13:50 ?        00:00:01 /usr/sbin/httpd -k start

The second number, 22378, is the PPID column, which is the PID of the Apache parent process.

I then waited for a line like this:

Dec 11 10:02:20 web02 kernel: httpd[7121]: segfault at 0000007fbf3fff0c rip 0000002a9567344a rsp 0000007fbf3ffe90 error 6

to show up in my /var/log/messages.

Once that showed up, I ran:

grep SIGSEGV /path/to/file_generated_w/strace

and noted the times and PIDs. Here is an example of the output:

19730 12:07:35 --- SIGSEGV (Segmentation fault) @ 0 (0) ---

19784 12:08:56 --- SIGSEGV (Segmentation fault) @ 0 (0) ---

I then grepped the lines for each PID that segfaulted (19784 and 19730 in the above example) out to separate files and began reading. For each PID I ran:

grep 19730 /path/to/file_generated_w/strace > /tmp/out.19730
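
With more than a couple of segfaulting PIDs the grep step gets tedious, so here is a rough sketch of how it could be looped. It assumes the trace was written with -f and -o as above, so that every line starts with the PID. The function name split_segv and its arguments are my own invention for illustration.

```shell
#!/bin/bash
# Sketch: split an strace -f -o log into one file per PID that segfaulted.
# split_segv is a hypothetical helper; $1 = strace output file, $2 = output dir.
split_segv() {
    local trace=$1 outdir=${2:-/tmp}
    local pid
    # With -f and -o every line starts with the PID, so field 1 is the PID.
    for pid in $(awk '/SIGSEGV/ { print $1 }' "$trace" | sort -u); do
        # Anchor on the leading PID so 19730 does not also match 119730
        # or a number that happens to appear in a syscall argument.
        grep "^${pid} " "$trace" > "${outdir}/out.${pid}"
        echo "${outdir}/out.${pid}"
    done
}

# Usage:
#   split_segv /path/to/file_generated_w/strace /tmp
```

Anchoring the pattern at the start of the line is the one real change from the plain grep: it avoids pulling in lines from other processes that merely mention the same number.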

It was in these files that I found my problem. Your mileage may vary, but I found this method much easier than using the Apache CoreDumpDirectory config setting, which requires several changes that have to be undone afterwards. The CoreDumpDirectory setting also requires a few restarts of Apache, which can be undesirable in a production environment.

The main caveat to using strace is that, on a busy server, you can generate 100-300 MB of logs per minute, so make sure you have enough disk space on the partition you are sending the strace output to.
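
To put a number on that before attaching strace, a back-of-the-envelope check like the one below may help. It is only a sketch: the helper name minutes_of_trace is hypothetical, and the default of 300 MB per minute is simply the worst case I mentioned above, not a measured value for your server.

```shell
#!/bin/bash
# Sketch: estimate how many minutes of strace output a partition can hold.
# minutes_of_trace is a hypothetical helper; $1 = free space in KB,
# $2 = expected log rate in MB per minute (defaults to 300).
minutes_of_trace() {
    local free_kb=$1 rate_mb=${2:-300}
    # Integer division: KB -> MB, then MB -> minutes at the given rate.
    echo $(( free_kb / 1024 / rate_mb ))
}

# Usage: feed it the free KB of the partition holding the output file.
#   minutes_of_trace "$(df -Pk /path/to | awk 'NR == 2 { print $4 }')"
```

The result rounds down, which is the safe direction when the question is how long you can leave the trace running.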

Linux or Windows

To throw my $.02 into this fray, I say:

Linux is for work and Windows is for play :)

But seriously, I have been amazed at how many free tools Linux includes with the OS (or via the net/repos).

Then there is the issue of the constantly evolving Windows "shell", which changes commands with almost every major release of the OS. With Linux I can run basically the same command, with little or no changes, that I ran almost 15 years ago when I started playing around with the OS. If Windows did this I would have much more respect for them and their OS.

10 things to get rid of spam

Just a few thoughts I had while battling the endless waves of canned meat I fight daily. Some of these are out of anger, some are out of frustration, and some are there just for fun. I will let you, the end user, determine which is which.

In no particular order:

  1. Get rid of free email providers.
  2. Make every newly registered domain set up (and its owner understand) SPF/DKIM or a similar DNS-based technology.
  3. Make email cost more per box, with reasonable limits. I mean, seriously, who needs 500 boxes for a company with 10 people? A little extra cost can go a long way.
  4. Tougher laws on people who send UCE (unsolicited commercial e-mail). A little can go a long way here as well.
  5. Make violators found guilty of sending UCE watch endless movies based on internet phenomena for weeks at a time. Think the "Dramatic Chipmunk" and all its variants flowing into the spammer's brain for a week, Clockwork Orange style.
  6. Make ISPs tell the truth when telling their customers what their spam filter policy really is. Mark it as spam, and the person you think isn't going to find out might just learn that you marked their message about family holiday cheer as spam.
  7. Catch-all accounts need to go away. Their day ended almost back in the days of the ARPANET. If we can end spam maybe these will be OK, but until then it's not a good idea, ever.
  8. Vacation/away messages. You just told a spammer that not only is the mailbox they are sending to real, but that the person there loves and misses them and will be back in a week.

(more soon, real work calls)

Yes I am sure!

[Image: ohmy]

From a single "badmail" folder on a Windows Server 2003 box. Amazing how bad script writers are, as almost all of these were caused by no "From" being set in the script.

Just doing an extended test.

[Image: winzipexpired]

This WinZip install is on the same fragged Server 2000 box from the last post. This screenshot was taken Jan 7th 2007, but sadly I cannot get a new shot, as an overzealous engineer uninstalled WinZip last week for some strange reason.

One fragged disk !

[Image: defragmeplease]

What do you think ?

This was a VERY old Server 2000 install that had been forgotten about. Seems time wasn't nice to the filesystem…or maybe the sysadmin was a sadistic BOFH who is into filesystem S&M ?!

Old Seagates never die!

While doing some spring cleaning I came across an old Seagate ST43400N 3G 50-pin SCSI drive and decided I would go back to my roots and benchmark this monster so I could compare it with the Dell system in a previous post.

Well, after a full run with the same sysbench test below I got … a whopping … 5MB per second! Compared with the test I ran below, it's very apparent that technology moves on and drives like these become novelties very quickly.

sysbench – Xeon X5550 16G

I got the chance to finally do some benchmarking on the new X5550 Xeons. Here is what I came up with using sysbench.

System:
CPU: 2x X5550 Xeons (8 cores)
RAM: 16G
Hardware Vendor (Model): Dell (R510)
OS: CentOS release 5.4 (Final)
Kernel : 2.6.18-164.6.1.el5 x86_64
HardDrive(s): OS - RAID 1 / /var/lib/mysql - RAID 10
Harddrive Controller: Perc6

my.cnf
innodb_log_group_home_dir=/var/log/innodb_logs
innodb_log_file_size=256M
innodb_log_files_in_group=2
innodb_buffer_pool_size=6G
innodb_additional_mem_pool_size=60M
innodb_log_buffer_size=4M
innodb_thread_concurrency=0 # As of MySQL 5.0.19, a value of 0 makes this unlimited
innodb_file_per_table=1
innodb_flush_log_at_trx_commit=2 #Risky but not a worry for this customer due to mainly static data. 

Sysbench Command :
sysbench --test=oltp --db-driver=mysql --num-threads=16 \
--mysql-user=root --max-time=60 --max-requests=0 --oltp-read-only=on run

Test Result (subsequent tests were within a small percentage of this result)
OLTP test statistics:
    queries performed:
        read:                            4457866
        write:                           0
        other:                           636838
        total:                           5094704
    transactions:                        318419 (5306.75 per sec.)
    deadlocks:                           0      (0.00 per sec.)
    read/write requests:                 4457866 (74294.56 per sec.)
    other operations:                    636838 (10613.51 per sec.)
Test execution summary:
 total time:                          60.0026s
 total number of events:              318419
 total time taken by event execution: 958.0739
 per-request statistics:
 min:                                  0.98ms
 avg:                                  3.01ms
 max:                                334.80ms
 approx.  95 percentile:              10.86ms

Threads fairness:
 events (avg/stddev):           19901.1875/1010.54
 execution time (avg/stddev):   59.8796/0.03

Sysbench Command (fileio) :
sysbench --test=fileio --max-time=60 --max-requests=1000000  --file-num=1 --file-extra-flags=direct --file-fsync-freq=0  --file-total-size=128M --file-test-mode=rndrd run
sysbench 0.4.10:  multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 1
Extra file open flags: 16384
1 files, 128Mb each
128Mb total file size
Block size 16Kb
Number of random requests for random IO: 1000000
Read/Write ratio for combined random IO test: 1.50
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random read test
Threads started!
Time limit exceeded, exiting...
Done.
Operations performed:  910652 Read, 0 Write, 0 Other = 910652 Total
Read 13.895Gb  Written 0b  Total transferred 13.895Gb  (237.15Mb/sec)
15177.50 Requests/sec executed