Time |
Nick |
Message |
04:48 |
|
eeevil joined #evergreen |
04:48 |
|
akilsdonk_ joined #evergreen |
04:48 |
|
Bmagic_ joined #evergreen |
04:49 |
|
jeffdavi1 joined #evergreen |
04:49 |
|
jeff___ joined #evergreen |
08:45 |
|
Dyrcona joined #evergreen |
09:05 |
|
maryj joined #evergreen |
09:29 |
|
yboston joined #evergreen |
09:35 |
Dyrcona |
If one changes kpac.xml what services need to be restarted? opensrf.settings? |
09:36 |
|
jeff___ joined #evergreen |
09:53 |
Dyrcona |
Great. cstore is not connected to the network, but I'm having a devil of a time figuring out why. |
10:06 |
Dyrcona |
This is weird. All of the servers are using the same opensrf.xml. |
10:06 |
Dyrcona |
Two of the sip servers are blowing up because of the application_name field, but all of the sip servers are running the same version of DBI: 1.612, so I'd expect them all to blow up. |
11:47 |
|
Christineb joined #evergreen |
12:35 |
|
sciani joined #evergreen |
12:43 |
jeff |
Dyrcona: get things all sorted out? |
13:23 |
|
Christineb joined #evergreen |
13:23 |
|
sciani joined #evergreen |
13:24 |
Dyrcona |
jeff: I don't think so. Other things crop up. |
13:41 |
Dyrcona |
So, almost 500GB of pg_xlog files, some timestamped from last night when I started a pingest. |
13:41 |
Dyrcona |
My guess is that a lot of these are junk and will never process, but how to tell? |
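One way to approach the "how to tell?" question, sketched under assumptions: when continuous archiving is enabled, PostgreSQL keeps marker files in the `archive_status` subdirectory of `pg_xlog`. A `.ready` marker means the segment is still waiting for `archive_command` to succeed, and a `.done` marker means it was archived and can be recycled. The helper name `archive_backlog` is hypothetical, and the caller must supply the real `pg_xlog` path for the installation.

```python
from pathlib import Path


def archive_backlog(xlog_dir):
    """Return (pending, done) lists of WAL segment names.

    Reads only the marker files in <xlog_dir>/archive_status and never
    touches the WAL segments themselves.
    """
    status_dir = Path(xlog_dir) / "archive_status"
    # "<segment>.ready": archive_command has not yet succeeded for it.
    pending = sorted(p.name[: -len(".ready")] for p in status_dir.glob("*.ready"))
    # "<segment>.done": archived; the server may recycle or remove it.
    done = sorted(p.name[: -len(".done")] for p in status_dir.glob("*.done"))
    return pending, done
```

A large and growing `pending` list would suggest the archiver is stuck or failing rather than the segments being junk. Deleting files from `pg_xlog` by hand is not safe in either case.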
13:42 |
jeff |
I would expect a pingest to generate a lot of WAL traffic like that. |
13:44 |
jeff |
Perhaps whatever normally consumes them has run out of disk itself? |
13:50 |
jeff |
(In terms of the target of an archive_command if applicable, or something else) |
14:00 |
Dyrcona |
Yeah. But, where the WAL files are being copied seems to have plenty of space. |
14:10 |
Dyrcona |
And, what logs I can find don't mention running out of space, though every time I google the error, I get results saying how to deal with that or the continuous archiving instructions. |
14:10 |
Dyrcona |
I get two errors, maybe the same, but they come out in two lines. |
14:11 |
Dyrcona |
One says the .bz2 file exists in the destination and the other says a cp failed with exit code = 131. |
14:12 |
Dyrcona |
The space does seem to be freeing up, though. |
14:13 |
Dyrcona |
I guess what worked in training doesn't work in production. :/ |
14:14 |
csharp |
Dyrcona: research pg_resetxlog |
14:14 |
csharp |
(pretty sure that's right) |
14:15 |
csharp |
you'd want to do a base backup right after so you can make sure you're covered |
14:15 |
csharp |
I had to do that during an upgrade several years ago - no fun, but at least we were already down |
14:16 |
jeff |
if your archive_command includes a compression step, it's possible that you ran out of resources other than disk (memory, cpu + time) during the pingest run. |
14:16 |
jeff |
in any event, good luck and proceed carefully. :-) |
14:16 |
csharp |
once the xlog is reset you can remove the old pg_xlog files |
14:17 |
Dyrcona |
csharp: Thanks. I'll look into that. |
14:17 |
Dyrcona |
jeff: Yeah, could be. There's a bzip2 apparently going on, but that cp exit code has me stumped, and the message related to the bzip2 files is that they already exist. Not helpful, I know. |
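An "already exists" message from a compressing archive step can appear when an earlier attempt was interrupted after writing the destination file, so the retry finds a leftover. The following is a minimal sketch of a retry-safe archive step, not the script in use here: the function name `archive_segment` and its return values are hypothetical. It compresses to a temporary file, renames it into place, treats an identical existing archive as success, and refuses to overwrite anything that differs.

```python
import bz2
import os
import tempfile
from pathlib import Path


def archive_segment(src, dest_dir):
    """Compress one WAL segment into dest_dir as <name>.bz2, retry-safely."""
    src = Path(src)
    dest = Path(dest_dir) / (src.name + ".bz2")
    data = src.read_bytes()

    if dest.exists():
        try:
            same = bz2.decompress(dest.read_bytes()) == data
        except (OSError, ValueError):
            # Corrupt or truncated archive left by an interrupted run.
            same = False
        if same:
            return "already-archived"
        # Never overwrite a differing file; fail so a human can look.
        raise FileExistsError(f"{dest} exists and differs from {src}")

    # Write to a temp file in the same directory, then rename atomically,
    # so a crash never leaves a partial file under the final name.
    fd, tmp = tempfile.mkstemp(dir=dest_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(bz2.compress(data))
            out.flush()
            os.fsync(out.fileno())
        os.rename(tmp, dest)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    return "archived"
```

A wrapper script called from `archive_command` would need to exit non-zero when this raises, so that PostgreSQL keeps the segment and retries later.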
14:18 |
Dyrcona |
pg_resetxlog sounds drastic from reading the documentation.... |
14:21 |
csharp |
in my case my disk was full and I had no other choice |
14:23 |
Dyrcona |
Yeah, my disk is close to full, but it seems to be clearing up a bit on its own. |
14:23 |
jeff |
Dyrcona: any OOM killer logs on the box doing the archiving? |
14:25 |
Dyrcona |
jeff: No, but there are disk errors. :( |
14:25 |
jeff |
ouch. |
14:26 |
jeff |
i repeat my earlier, "good luck and proceed carefully" |
14:27 |
Dyrcona |
jeff: For a drive other than the one where I think the logs are going. |
14:27 |
Dyrcona |
I had to check the partitions on that machine. |
14:27 |
Dyrcona |
failure on sdb, but logs are going to sda, it looks like |
14:28 |
jeff |
what is sdb? |
14:28 |
Dyrcona |
So much nagios junk in the logs. |
14:30 |
Dyrcona |
jeff: It looks like just a mounted drive that's not used, but I'm not so sure now that I have a third look. |
14:32 |
Dyrcona |
It's mounted by the db server as /mnt/backups, but /dev/sda8 on the destination machine is mounted as /usr/backups on the db server, and that is where backups are going. |
14:33 |
|
Bmagic joined #evergreen |
14:41 |
Dyrcona |
Nothing like a good, old-fashioned hardware failure to go along with the typical upgrade issues. |
15:32 |
Dyrcona |
csharp: Do you use continuous WAL archiving? Or, rather who in here does (that's around to answer)? |
15:46 |
|
remingtron_ joined #evergreen |
15:51 |
|
dbwells joined #evergreen |
16:11 |
|
remingtron joined #evergreen |
19:24 |
|
dcook joined #evergreen |