Time | Nick | Message
06:40 |
|
rlefaive joined #evergreen |
07:44 |
|
mrpeters joined #evergreen |
07:44 |
|
JBoyer joined #evergreen |
08:02 |
* csharp |
wishes he had a dime for every time he says "God, I f*-ing hate reports" under his breath |
08:24 |
tsbere |
csharp: Why are you wishing small? Wish for $100 for every time. :P |
08:28 |
jeff |
if i had a nickel for every time someone had intentionally run an Evergreen reporter report in the last year, I'd have... hrm. $0.05. |
08:28 |
|
collum joined #evergreen |
08:29 |
* tsbere |
comments on bug 1486592 but doesn't examine it much more closely |
08:29 |
pinesol_green |
Launchpad bug 1486592 in Evergreen "Copies in concerto data should have prices" [Wishlist,Confirmed] https://launchpad.net/bugs/1486592 |
08:31 |
|
Dyrcona joined #evergreen |
08:36 |
jeff |
based on the longer-than-usual delay between "job started" and "job's done" emails hitting my inbox, i think our statewide resource sharing catalog bibs are loading properly again. |
08:42 |
Dyrcona |
jeff: That's good. I usually have those jobs spit out what is going on, including any errors. |
08:42 |
Dyrcona |
Some jobs I only have it send me the errors. |
08:43 |
|
rjackson_isl joined #evergreen |
08:45 |
jeff |
yeah, this is on a system we don't control. we just deliver the files, and at the usual time each day i get a series of three emails for each file. |
08:45 |
jeff |
begin, success (with a detailed log), end |
08:45 |
jeff |
or sometimes: begin, no files to load, end |
08:47 |
jeff |
and every so often, begin, failure, end. |
08:47 |
jeff |
good to have checks for things like "are we sending them files" and "are they loading files" |
08:47 |
jeff |
but what i was not checking for was "is the file we are sending them identical to the last file we sent them?" |
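For anyone wanting that kind of check, here is a minimal sketch of one way to do it in Perl: hash the outgoing file and compare it against the digest recorded on the previous run. The paths and the digest file are placeholders for illustration, not part of jeff's actual process.

```perl
#!/usr/bin/perl
# Sketch only: flag "today's export file is identical to the last one sent".
# The paths here are assumptions, not the real delivery process.
use strict;
use warnings;
use Digest::SHA;

my $outgoing   = '/exports/updates.mrc';    # file about to be delivered (assumed)
my $state_file = '/exports/.last_digest';   # digest saved by the previous run (assumed)

my $current = Digest::SHA->new(256)->addfile($outgoing)->hexdigest;

my $previous = '';
if (-e $state_file) {
    open my $in, '<', $state_file or die "open $state_file: $!";
    chomp($previous = <$in> // '');
    close $in;
}

warn "WARNING: outgoing file is identical to the last file sent\n"
    if $previous && $previous eq $current;

# Record the current digest for the next run to compare against.
open my $out, '>', $state_file or die "write $state_file: $!";
print {$out} "$current\n";
close $out;
```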
08:48 |
|
mmorgan joined #evergreen |
08:55 |
|
rlefaive joined #evergreen |
09:00 |
Dyrcona |
I've got one that checks for zero byte files, 'cause we may not have added new records on a weekend. |
09:01 |
Dyrcona |
But, I don't worry if the files are identical, because they shouldn't be. That would typically mean going a month without us adding or deleting records and that's inconceivable. :) |
09:03 |
jeff |
yeah, in this case it was a sequence conflict on a state table that's used to generate incrementals. |
09:03 |
jeff |
meaning no new entries in the state table, so for the past several mornings the update file had been identical. |
09:04 |
jeff |
so, the better thing to check going forward will be "are there new entries in the state table?" :-) |
09:04 |
Dyrcona |
Sounds like it, but I don't know exactly what your process is. |
09:05 |
jeff |
Dyrcona: it's one of those processes that i wouldn't know exactly what the process is if i didn't have it written down. |
09:05 |
Dyrcona |
:) |
09:06 |
Dyrcona |
If it has to do with ILL, I thought I knew the process, but now I'm not so sure. |
09:06 |
jeff |
shifting gears a bit, for those on Apache 2.4 did you find that you need to further limit the number of requests each Apache child processes before being recycled? |
09:06 |
Dyrcona |
Let me check. |
09:07 |
jeff |
as in, use a higher value for MaxConnectionsPerChild / MaxRequestsPerChild than before? |
09:07 |
Dyrcona |
I think we had increased things dramatically, but when we split so public and staff were on different bricks, we had to let Apache drop to defaults on public to avoid crashes.
09:07 |
jeff |
default is 0, meaning no limit to "the amount of memory that process can consume by (accidental) memory leakage" :-) |
09:09 |
jeff |
on Debian Jessie with Apache 2.4 and recent master, I'm able to run a 4 GB machine out of memory rather quickly. I'm upping the RAM for the VM, but adjusting MaxConnectionsPerChild does seem to quite effectively resolve the symptoms. |
09:10 |
Dyrcona |
jeff: Yeah, on public I recently set MaxConnectionsPerChild to 1000, MaxSpareServers 20, MaxRequestWorkers 120, the rest are at defaults. |
09:11 |
Dyrcona |
By recently, I mean the file was last changed on Dec. 1. |
09:11 |
* jeff |
nods |
09:11 |
jeff |
thanks for looking! |
09:11 |
Dyrcona |
IIRC, it took a few weeks of tinkering. |
09:11 |
Dyrcona |
I'll check our staff brick, too. Its configuration is different.
09:11 |
jeff |
do you recall if "hey, we're running out of memory" was the primary motivation behind the change? |
09:12 |
Dyrcona |
Yes, it was. |
09:12 |
Dyrcona |
Sometimes it was running out of memory. Other times, the box looked mostly OK, but load was stratospheric. |
09:13 |
Dyrcona |
This VM has 24GB of RAM configured, too. :) |
09:13 |
Dyrcona |
Presently, it is using 6.6GB, or 3.0GB +/- buffers and cache. |
09:14 |
Dyrcona |
It also has 24GB of swap, and yes we were seeing all of it consumed. |
09:15 |
Dyrcona |
We could probably drop the RAM down to 16GB or maybe as low as 8GB at this point. |
09:15 |
|
jwoodard joined #evergreen |
09:16 |
Dyrcona |
The Apache vm on the staff brick has the same memory configuration, but is using 12GB (11GB +/- buffers and cache). |
09:16 |
Dyrcona |
They're both using about 10MB of swap, btw. |
09:17 |
jeff |
which mostly means that something pressured things to swap at some time probably long ago. :-) |
09:17 |
Dyrcona |
Yes, I don't worry about swap usage much. It doesn't seem to affect performance until you hit about 75%. |
09:18 |
Dyrcona |
It's probably just some data that was needed at startup and *might* be needed later. |
09:18 |
Dyrcona |
So, for the staff brick things are a bit different. |
09:19 |
Dyrcona |
We use StartServers 100, MinSpareServers 25, MaxSpareServers 150, MaxRequestWorkers 500, and MaxConnectionsPerChild 1000.
09:20 |
Dyrcona |
As I recall, the default for MaxRequestWorkers is 256, and I was seeing us starting to run out of RAM on the public side with around 150 Apache processes going. |
09:20 |
Dyrcona |
Thus, I chose 120 for public. |
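Put together as a config excerpt, the tuning described above would look roughly like the following for Apache 2.4 with mpm_prefork (on Debian, typically /etc/apache2/mods-available/mpm_prefork.conf). The values are the ones quoted in the conversation; anything not mentioned is a stock default or noted as an assumption.

```apache
# Public-brick Apache, per the values above. StartServers and MinSpareServers
# are left at the Debian 2.4 defaults, matching "the rest are at defaults".
<IfModule mpm_prefork_module>
    StartServers              5
    MinSpareServers           5
    MaxSpareServers          20
    MaxRequestWorkers       120
    MaxConnectionsPerChild 1000
</IfModule>

# Staff-brick Apache (a separate machine), per the later messages.
<IfModule mpm_prefork_module>
    StartServers            100
    MinSpareServers          25
    MaxSpareServers         150
    ServerLimit             500   # prefork caps MaxRequestWorkers at ServerLimit (default 256)
    MaxRequestWorkers       500
    MaxConnectionsPerChild 1000
</IfModule>
```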
09:21 |
Dyrcona |
We've found that the use patterns for the OPAC and the staff client are very different. |
09:21 |
Dyrcona |
So, splitting the bricks and configuring Apache differently for each has really helped. |
09:22 |
Dyrcona |
The other thing is, we had a lot of idle Apache processes on the public side with our original settings. |
09:22 |
Dyrcona |
We have fewer idlers now. |
09:24 |
jeff |
since in my experience "bricks" often gets applied to different things, can you clarify your usage? :-) |
09:26 |
tsbere |
jeff: We have two 3 vm "bricks", one for public and one for staff. |
09:26 |
tsbere |
(we also have single utility and sip vms) |
09:27 |
tsbere |
No load balancing or anything |
09:28 |
jeff |
with no load balancing, what components run on each of the three VMs in a given brick? |
09:28 |
jeff |
is public apache all on one of the three VMs in the public brick? |
09:29 |
jeff |
or am i failing to understand your answer? |
09:29 |
tsbere |
For both 3 vm bricks Apache is one vm, ejabberd/router the second (settings as well), all other drones on the third |
09:29 |
jeff |
okay, got it. |
09:31 |
Dyrcona |
Yeah, not really what people think of as the traditional brick setup. |
09:32 |
|
RoganH joined #evergreen |
09:32 |
jeff |
'swhy i asked. :-) |
09:32 |
Dyrcona |
It seems to work for us that way, and certainly better than when we ran it all on 1 machine. |
09:32 |
* jeff |
nods |
09:33 |
jeff |
how many physical machines are you spread across now? |
09:33 |
Dyrcona |
'Course we later found out that the RAID on that 1 machine was switching between spares because one of the main drives had died. |
09:33 |
Dyrcona |
Counting the database server, 3. |
09:34 |
Dyrcona |
There's one physical machine for each "brick." |
09:35 |
jeff |
and the sip and utility VMs are shoved in there somewhere also? |
09:35 |
Dyrcona |
Yes. The utility runs on the staff brick hardware. |
09:36 |
Dyrcona |
And sip runs on the public side. |
09:36 |
* Dyrcona |
had to check to make sure. |
09:36 |
jeff |
heh |
09:36 |
|
jvwoolf joined #evergreen |
09:36 |
Dyrcona |
tsbere is away from his desk. He would just know the answer. |
09:37 |
Dyrcona |
Four vms on each physical machine. |
09:37 |
tsbere |
Note that the sip/utility vms are standalone bricks themselves |
09:37 |
tsbere |
Also, if you want to count NFS/Logging there are 4 machines |
09:37 |
jeff |
heh |
09:38 |
jeff |
nfs/logging on a fourth physical machine? did you stick memcached there also, or somewhere else? |
09:38 |
tsbere |
I think I stuck memcached on the DB server, actually |
09:38 |
Dyrcona |
Is memcached still running on the db server? |
09:38 |
Dyrcona |
heh |
09:38 |
Dyrcona |
I forgot that sip and utility run their own ejabberd. I remembered that they do run drones and listeners. |
09:39 |
* tsbere |
wanted the "public" side to work even when the NFS box was down, so it has copies of all the configs instead of symlinks |
09:39 |
tsbere |
That includes sip, the utility vm and the staff brick all use NFS-hosted files directly |
09:39 |
Dyrcona |
The logging box would be an OK place for memcached, though. |
09:39 |
jeff |
with a single staff VM for all staff apache processes, what kinds of things do you end up using NFS for? |
09:40 |
jeff |
usual things like reporter output don't seem to apply, though maybe if you're running clark on the utility VM... |
09:40 |
tsbere |
jeff: Well, for starters, reports and utility stuff end up moving from the utility VM to the staff apache VM |
09:41 |
Dyrcona |
So, yeah, we do run clark on the utility vm. |
09:41 |
tsbere |
jeff: Also, I think there is at least one part of the system where apache writes files but drones read them, so those VMs need to talk to NFS
09:47 |
|
maryj joined #evergreen |
09:51 |
jeff |
tsbere: ah, probably something in vandelay? |
09:51 |
* jeff |
tries to think |
09:52 |
tsbere |
jeff: Don't recall if it was vandelay or offline circ |
09:52 |
tsbere |
That being a programmer's or, meaning it could be both. :P |
09:53 |
jeff |
heh |
10:01 |
|
yboston joined #evergreen |
10:01 |
|
rlefaive_ joined #evergreen |
10:20 |
|
Christineb joined #evergreen |
10:33 |
|
Guest16800 left #evergreen |
12:01 |
|
rlefaive joined #evergreen |
12:02 |
|
bmills joined #evergreen |
12:09 |
|
jihpringle joined #evergreen |
12:49 |
pinesol_green |
[evergreen|Jason Stephenson] LP 1499123: Add release notes. - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=eabd816> |
12:49 |
pinesol_green |
[evergreen|Jason Stephenson] LP 1499123: Modify Perl code for csp.ignore_proximity field. - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=831a808> |
12:49 |
pinesol_green |
[evergreen|Jason Stephenson] LP 1499123: Add ignore_proximity to config.standing_penalty. - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=63205ed> |
13:23 |
|
bmills1 joined #evergreen |
13:32 |
|
bmills joined #evergreen |
13:53 |
berick |
Dyrcona: FYI @ http://git.evergreen-ils.org/?p=working/random.git;a=shortlog;h=refs/heads/collab/berick/pingest -- adding some options to your record ingest script. let me know if you ever put it on github or similar, i'll send a pull request. |
13:55 |
Dyrcona |
berick: Cool! I'll have to give your changes a try some time, soon. |
14:09 |
|
Stompro joined #evergreen |
14:09 |
kmlussier |
@hate scheduling meetings |
14:09 |
pinesol_green |
kmlussier: But kmlussier already hates scheduling meetings! |
14:10 |
berick |
@reallyhate |
14:10 |
pinesol_green |
berick: Try restarting apache. |
14:10 |
kmlussier |
@loathe scheduling meetings |
14:10 |
pinesol_green |
kmlussier: http://www.firstpersontetris.com/ |
14:15 |
Dyrcona |
git apply is not being my friend. |
14:17 |
Dyrcona |
I downloaded berick's change above as a patch, but git refuses to apply it.
14:17 |
Dyrcona |
At first it complains about a whitespace change. |
14:17 |
Dyrcona |
After I fix that, it says nothing, but the file is unpatched.
14:18 |
berick |
maybe a -p or --directory param? |
14:18 |
berick |
the path is different |
14:18 |
Dyrcona |
Yeah, I did it in the directory where the file lives. I'll try -p. |
14:21 |
Dyrcona |
-p 1 didn't help, but doing it from my root and specifying --directory=./perl did.
14:22 |
berick |
cool |
14:26 |
Dyrcona |
Patch applied in a test branch. I'll have to test it later this week. |
14:27 |
jeff |
oh hey, i got this to happen again: |
14:28 |
jeff |
OpenSRF Drone [open-ils.search] |
14:28 |
jeff |
\_ OpenSRF Drone [open-ils.search] |
14:28 |
jeff |
\_ OpenSRF Drone [open-ils.search] |
14:32 |
jeff |
alas: |
14:32 |
jeff |
Your search - "opensrf listener" "gets confused" "thinks it's a drone" - did not match any documents. |
14:33 |
gmcharlt |
jeff: does it seem to actually be confused, or is it just that the process name is wrong? |
14:33 |
jeff |
ejabberd sends it a message, it ignores it. |
14:33 |
* Dyrcona |
wonders if it didn't die and the oldest child somehow got promoted, not sure if each becomes its own process group leader. |
14:34 |
* Dyrcona |
thinks they do, so that shouldn't happen, but.... |
14:34 |
jeff |
pid and process start time indicate that it is the Listener process, but it has changed name and behavior (but not pid or process start time) |
14:34 |
Dyrcona |
jeff: OK. That is strange. I've never seen that happen. |
14:35 |
jeff |
unusual circumstances: i've been intentionally running this machine out of memory by hammering it with tpac search requests, and OOM killer has killed apache processes. |
14:35 |
Dyrcona |
Right. I thought that might be part of it. |
14:36 |
jeff |
debian jessie, recent master of both opensrf and evergreen. |
14:39 |
Dyrcona |
Oh, beauty... git log -p reveals I have a byte order marker in a SQL query. |
14:39 |
jeff |
i can't reliably reproduce outside of the "i've done it twice in the past two days" |
14:39 |
Dyrcona |
Well, you're deliberately over stressing the system. I'd expect the listener to just die, though, and not start acting like a drone. |
14:40 |
jeff |
i'm wondering if the behavior could result from a failed attempt to fork. |
14:40 |
jeff |
that might be way off base, though. |
14:41 |
Dyrcona |
Query still works, though. |
14:42 |
Dyrcona |
I'm not sure what would happen in that case. |
14:42 |
Dyrcona |
Typically, if fork fails, you exit your program. |
14:43 |
jeff |
oh, that's exactly what's happening. |
14:43 |
jeff |
$child->{pid} = fork(); |
14:43 |
jeff |
perl fork() returns undef if the fork failed. |
14:43 |
jeff |
if($child->{pid}) { # parent process |
14:43 |
jeff |
... |
14:43 |
Dyrcona |
Yes. |
14:43 |
jeff |
} else { # child process |
14:43 |
Dyrcona |
That looks like the culprit. Good catch! |
14:44 |
jeff |
and in that else $child->{pid} gets set to $$ and we eval $child->init, which is where $0 gets set to OpenSRF Drone [$service] |
14:46 |
jeff |
we should probably treat 0 and undef differently. |
14:46 |
Dyrcona |
Yes, definitely. |
14:47 |
Dyrcona |
It would probably be safe to log the failure to fork a new child and then do nothing if the result of fork is undef. |
14:47 |
* jeff |
nods |
14:48 |
jeff |
of course, that could lead to a situation where we chew CPU because there's no memory, so perhaps a sleep or some other backoff would be suitable also. |
14:48 |
jeff |
but at that point, you're probably already in deep trouble. |
14:50 |
Dyrcona |
Yeah, when you're out of resources and can't fork a process, it might be a good idea to hang it up and go home. :) |
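A minimal sketch of the kind of fix being discussed: treat an undef return from fork() (failure) differently from 0 (child), log it, and back off. The $child hashref and the init step echo the snippet jeff quoted, but the surrounding structure here is illustrative, not the actual OpenSRF code.

```perl
#!/usr/bin/perl
# Sketch only (not the actual OpenSRF source): distinguish a failed fork()
# (undef, e.g. out of memory) from the child case (0), as discussed above.
use strict;
use warnings;

sub spawn_drone {
    my ($child) = @_;    # hashref standing in for the listener's child record

    my $pid = fork();

    if (!defined $pid) {
        # fork failed: log and back off instead of falling through to the
        # "child" branch and turning the listener into a phantom drone
        warn "fork() failed: $!; backing off\n";
        sleep 1;
        return;
    }

    if ($pid) {
        # parent (listener): remember the child's pid and carry on
        $child->{pid} = $pid;
        return $pid;
    }

    # real child: this is where the process renames itself to a drone
    $child->{pid} = $$;
    eval { $child->{init}->() };    # init is assumed to set $0, as in the quoted code
    exit 0;
}

my $kid = spawn_drone({ init => sub { $0 = 'OpenSRF Drone [example]' } });
waitpid($kid, 0) if $kid;    # reap the example child
```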
14:55 |
kmlussier |
Calling 0951 |
15:01 |
* Dyrcona |
decided to fire off a pingest.pl test on his dev vm anyway. |
15:02 |
|
krvmga joined #evergreen |
15:03 |
pinesol_green |
[evergreen|Kathy Lussier] LP 1499123: Stamping upgrade script for standing-penalty-ignore-proximity - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=c1b64bf> |
15:06 |
|
mmorgan1 joined #evergreen |
16:01 |
|
jlitrell joined #evergreen |
16:05 |
Dyrcona |
berick++ I added and pushed your patch for pingest.pl. |
16:05 |
pinesol_green |
[evergreen|Chris Sharp] LP#1486592 - Generate prices for concerto dataset. - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=05e9a08> |
16:06 |
berick |
Dyrcona++ |
16:07 |
|
mmorgan joined #evergreen |
16:08 |
|
vlewis joined #evergreen |
16:09 |
|
rlefaive joined #evergreen |
16:23 |
JBoyer |
gmcharlt++ # Helping Clark overcome amnesia |
16:50 |
|
vlewis_ joined #evergreen |
16:53 |
|
vlewis joined #evergreen |
16:57 |
|
vlewis_ joined #evergreen |
17:07 |
|
ddale joined #evergreen |
17:07 |
|
mmorgan left #evergreen |
17:07 |
kmlussier |
gmcharlt: I was just looking at bug 1067823 again. Do you know why MARC tag 659 was added to the definition for the genre field? I would have thought we would be using the mods definition there, and I don't see any documentation that points to 659 being a genre field. |
17:07 |
pinesol_green |
Launchpad bug 1067823 in Evergreen "tpac: genre links in record details page launch subject search" [Medium,Confirmed] https://launchpad.net/bugs/1067823 - Assigned to Kathy Lussier (klussier) |
17:10 |
gmcharlt |
kmlussier: as near as I can tell, it's a legacy of the NLM using 659 for genre back in the days when dinosaurs roamed the earth |
17:11 |
gmcharlt |
and my including it in that patch was basically just cargo-culting from Open-ILS/src/templates/opac/parts/record/subjects.tt2
17:11 |
gmcharlt |
all of that said: I agree it's non-standard, and I certainly have no object to just sticking with 655 |
17:11 |
gmcharlt |
*objectin |
17:11 |
gmcharlt |
*objection |
17:11 |
gmcharlt |
@coffee me |
17:11 |
* pinesol_green |
brews and pours a cup of Bonsai Blend Espresso, and sends it sliding down the bar to me |
17:14 |
kmlussier |
gmcharlt: OK, I'll play with that branch a bit. |
17:14 |
kmlussier |
@coffee gmcharlt |
17:14 |
* pinesol_green |
brews and pours a cup of Ethiopia Yirgacheffe, and sends it sliding down the bar to gmcharlt |
17:14 |
kmlussier |
Good luck sleeping on that |
17:14 |
* gmcharlt |
buzzes |
17:40 |
kmlussier |
I actually see two records in a production system with a 659. But, in one case, I'm pretty sure it's a mistake since I've never heard of a genre called "shark attacks" |
17:41 |
gmcharlt |
kmlussier: you must not watch the Syfy channel ;) |
17:41 |
gmcharlt |
but yeah, 2 records is just noise |
18:28 |
|
jihpringle_ joined #evergreen |
18:30 |
|
dluch_ joined #evergreen |
18:32 |
|
dbs_ joined #evergreen |
18:32 |
|
berick_ joined #evergreen |
18:37 |
jwoodard |
@decide haiku or no |
18:37 |
pinesol_green |
jwoodard: go with haiku |
18:44 |
jwoodard |
Warm wind blowing free, February moves onward, birds chirp happily. |
18:44 |
|
Bmagic joined #evergreen |
18:44 |
|
hopkinsju joined #evergreen |
18:44 |
|
dluch joined #evergreen |
18:45 |
|
_bott_ joined #evergreen |
19:53 |
|
book` joined #evergreen |
20:15 |
jlitrell |
Ahh, February / Snow, snow, snow, rain, rain, snow, rain / I can't feel my feet. |
20:17 |
jeff |
heh |
20:17 |
jeff |
jwoodard++ jlitrell++ |
20:27 |
|
bmills joined #evergreen |
21:55 |
|
scrawler joined #evergreen |
21:56 |
scrawler |
anybody here this evening? |
21:57 |
scrawler |
see you later... |
21:57 |
|
scrawler left #evergreen |
21:57 |
jeff |
patience... |
22:09 |
|
phil___ joined #evergreen |