August 2012 Archives

http://wiki.prgmr.com/mediawiki/index.php/20120626_troubleshooting_ipv6 was the last one; this one is similar. Hughes is being rebooted; we will upgrade the kernel and report back.

od ahci libata raid10 shpchp mptsas mptscsih mptbase scsi_transport_sas sd_mod scsi_mod raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 0, comm: swapper Not tainted 2.6.18-308.8.2.el5xen #1
RIP: e030:[<ffffffff8026389d>]  [<ffffffff8026389d>] _spin_lock+0x0/0xa
RSP: e02b:ffffffff8079dc88  EFLAGS: 00000286
RAX: ffffffff80752000 RBX: ffff88000c287280 RCX: ffff88002edbc680
RDX: 0000000000000000 RSI: ffff88000c287280 RDI: ffff8800245f3200
RBP: ffff88000c287280 R08: 0000000000000000 R09: ffffffff88674f95
[Wed Aug 22 16:45:04 2012] R10: 0000000080000000 R11: 0000000050356ec3 R12: ffff8800245f3000
R13: ffff88000f437800 R14: ffff880028c6f280 R15: ffffffff886751e6
FS:  00002b5fcc34ef50(0000) GS:ffffffff80635000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000

Call Trace:
 <IRQ>  [<ffffffff802309ff>] dev_queue_xmit+0x265/0x3ef
 [<ffffffff886751e6>] :bridge:__br_forward+0x0/0x9c
 [<ffffffff886751e6>] :bridge:__br_forward+0x0/0x9c
 [<ffffffff8867516e>] :bridge:br_dev_queue_push_xmit+0x1d9/0x200
[Wed Aug 22 16:45:04 2012] [<ffffffff886751e4>] :bridge:br_forward_finish+0x4f/0x51
 [<ffffffff8867524f>] :bridge:__br_forward+0x69/0x9c
 [<ffffffff88674d7e>] :bridge:deliver_clone+0x36/0x3d
 [<ffffffff88674da9>] :bridge:maybe_deliver+0x24/0x35
 [<ffffffff88674e20>] :bridge:br_multicast_flood+0x66/0x106
 [<ffffffff88675d41>] :bridge:br_handle_frame_finish+0x0/0x1d3
 [<ffffffff88675e21>] :bridge:br_handle_frame_finish+0xe0/0x1d3
 [<ffffffff88676099>] :bridge:br_handle_frame+0x185/0x1a4
 [<ffffffff8022134c>] netif_receive_skb+0x3a8/0x4c4
 [<ffffffff88292e67>] :igb:igb_poll+0x73e/0xb55
[Wed Aug 22 16:45:04 2012] [<ffffffff8026fcb8>] timer_interrupt+0x3d2/0x3e6
 [<ffffffff803b916c>] unmask_evtchn+0x2d/0xd9
 [<ffffffff80263929>] _spin_lock_irqsave+0x9/0x14
 [<ffffffff8020d003>] net_rx_action+0xb4/0x1c6
 [<ffffffff80212eb8>] __do_softirq+0x8d/0x13b
 [<ffffffff8025fda4>] call_softirq+0x1c/0x278
 [<ffffffff8026db89>] do_softirq+0x31/0x90
 [<ffffffff8025f8d6>] do_hypervisor_callback+0x1e/0x2c
 <EOI>  [<ffffffff802063aa>] hypercall_page+0x3aa/0x1000
 [<ffffffff802063aa>] hypercall_page+0x3aa/0x1000
[Wed Aug 22 16:45:04 2012] [<ffffffff8026efc8>] raw_safe_halt+0x87/0xab
 [<ffffffff8026c573>] xen_idle+0x38/0x4a
 [<ffffffff8024ac05>] cpu_idle+0x97/0xba
 [<ffffffff80760b11>] start_kernel+0x21f/0x224
 [<ffffffff807601e5>] _sinittext+0x1e5/0x1eb
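For reference, here is a rough sketch of how one might resolve the faulting RIP above to a source line, assuming the matching kernel-xen-debuginfo package is installed (the vmlinux path below is the usual RHEL/CentOS 5 location, not something taken from this box):

VMLINUX=/usr/lib/debug/lib/modules/2.6.18-308.8.2.el5xen/vmlinux
addr2line -f -i -e "$VMLINUX" ffffffff8026389d   # should land in _spin_lock
# module frames like :bridge:__br_forward would need the bridge module's debuginfo instead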



anyhow, the machine is upgraded and it's coming back up now.
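A quick sanity check once it's back, roughly (package and command names are the stock CentOS 5 ones, not taken from the post):

uname -r                           # confirm we're running the new el5xen kernel
rpm -q kernel-xen                  # list the kernel-xen packages now installed
dmesg | grep -i -A10 'Call Trace'  # make sure the bridge/spinlock trace hasn't reappeared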

latency to hosts in Sacramento

Our hosts in Sacramento only have access through Cogent right now, and they seem to be having a network problem causing much higher latency than usual. In this traceroute:
traceroute to coloma.prgmr.com (38.99.2.75), 30 hops max, 60 byte packets
 1  gateway.prgmr.com (216.218.223.65)  0.178 ms  0.154 ms  0.156 ms
 2  gige-g6-4.core1.fmt1.he.net (216.218.217.177)  3.702 ms  4.017 ms  4.228 ms
 3  10gigabitethernet1-1.core1.pao1.he.net (184.105.213.66)  3.941 ms  4.017 ms  2.760 ms
 4  sjo-bb1-link.telia.net (213.248.86.53)  1.446 ms  1.425 ms  1.401 ms
 5  te0-7-0-2.ccr21.sjc03.atlas.cogentco.com (154.54.12.241)  1.873 ms  1.928 ms  2.595 ms
 6  te0-0-0-3.ccr21.sjc01.atlas.cogentco.com (154.54.6.105)  12.318 ms te0-1-0-3.ccr21.sjc01.atlas.cogentco.com (154.54.6.237)  12.382 ms  12.434 ms
 7  154.54.85.29 (154.54.85.29)  12.034 ms te0-2-0-1.mpd21.lax01.atlas.cogentco.com (154.54.6.29)  12.140 ms 154.54.85.25 (154.54.85.25)  12.312 ms
 8  * * *
 9  * * *
10  * * *
11  * * *
12  te2-8.ccr01.slc01.atlas.cogentco.com (154.54.82.221)  65.021 ms te4-2.ccr01.slc01.atlas.cogentco.com (154.54.0.42)  64.844 ms te2-7.ccr01.slc01.atlas.cogentco.com (154.54.82.217)  64.728 ms
13  te7-2.ccr01.boi01.atlas.cogentco.com (154.54.40.94)  64.811 ms te4-2.ccr01.boi01.atlas.cogentco.com (154.54.40.90)  65.138 ms  65.304 ms
14  te4-4.ccr01.pdx02.atlas.cogentco.com (154.54.40.97)  65.281 ms  65.241 ms te7-4.ccr01.pdx02.atlas.cogentco.com (154.54.40.101)  65.155 ms
15  te4-3.ccr01.smf01.atlas.cogentco.com (154.54.40.113)  77.795 ms  77.823 ms  77.860 ms
16  te4-2.mag01.smf01.atlas.cogentco.com (154.54.80.242)  77.774 ms te4-1.mag01.smf01.atlas.cogentco.com (154.54.80.238)  77.997 ms te4-2.mag01.smf01.atlas.cogentco.com (154.54.80.242)  77.864 ms
17  vl3506.na41.b015947-1.smf01.atlas.cogentco.com (38.20.48.58)  79.001 ms  78.975 ms  78.947 ms
18  ripple_web.demarc.cogentco.com (38.112.242.166)  78.546 ms  78.612 ms  79.173 ms
19  coloma.prgmr.com (38.99.2.75)  78.420 ms !X  78.468 ms !X  78.407 ms !X

Judging by the hostnames, the traffic is being routed through Salt Lake City (slc01), Boise (boi01), and Portland (pdx02) before reaching Sacramento, which would explain the high latency. Hopefully Cogent will get this fixed soon.
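If you want to keep an eye on the path yourself while this is going on, something along these lines should work (standard mtr flags; the hostname is the one from the traceroute above):

# 100 probe cycles, numeric output; compare the per-hop averages over time
mtr --report --report-cycles 100 --no-dns coloma.prgmr.com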
he.net says they aren't having an outage, and our reseller's webpage, colocationonline.info, appears to be down as well, so that's probably where the problem is.

Nick is driving out there just in case it's something up with our stuff.

and we are back (I think as of 6:00 AM PST).

we will have schedules and plans for moving later.