02:40 * alynn26 joined #evergreen
06:02 <pinesol> News from qatests: Testing Success <http://testing.evergreen-ils.org/~live>
07:15 * rjackson_isl_hom joined #evergreen
07:24 <csharp> jeffdavis++
08:04 * rfrasur joined #evergreen
08:06 * collum joined #evergreen
08:11 * mantis2 joined #evergreen
08:23 * Dyrcona joined #evergreen
08:37 * mmorgan joined #evergreen
08:57 * mmorgan joined #evergreen
10:14 * Cocopuff2018 joined #evergreen
11:13 * collum_ joined #evergreen
11:17 * nfBurton joined #evergreen
11:21 <Dyrcona> Am I the only one who cringes when reading old bug comments that they've made?
11:22 <mmorgan> Dyrcona: It depends on the comment :)
11:22 <nfBurton> Oh, I've been there.
11:22 <Dyrcona> mmorgan++ Indeed, it does.
11:25 <rhamby> I've been known to cringe a time or two.
11:35 <Dyrcona> I'm going through the list of security bugs, and I'm not certain that I agree with all of them being classed as "security" issues. :)
11:56 <csharp> Dyrcona: yes, I cringe
11:58 <csharp> @whocares launchpad
11:58 <pinesol> paxed hates launchpad
11:59 <csharp> @seen paxed
11:59 <pinesol> csharp: I have not seen paxed.
12:00 <jeffdavis> I've definitely filed some "this could be used to DoS an EG instance" type bugs as private security bugs as a precaution, and I would be fine if they were recategorized.
12:11 * jihpringle joined #evergreen
12:17 <Dyrcona> pinesol: I've seen paxed. You just need to hang out in #koha on OFTC.
12:17 <pinesol> Dyrcona: http://wonder-tonic.com/geocitiesizer/content.php?theme=2&music=6&url=evergreen-ils.org
12:18 <Dyrcona> jeffdavis: Yeah, I'm OK with those being security bugs. I was thinking of a couple of others, such as removing a deprecated field and being able to bypass hold limits.
12:19 <Dyrcona> I also disagree with the last being a bug. It's a feature! :)
12:19 <Dyrcona> https://bugs.launchpad.net/evergreen/+bug/1017990
12:19 <pinesol> Launchpad bug 1017990 in Evergreen 3.6 "Possible to bypass holds placement limits via direct API calls" [Medium,Confirmed]
12:22 <jeffdavis> my initial reaction on that was if a patron figures out how to do that, they've kind of earned it ;)
12:22 <jeffdavis> (but tried to fix it anyway)
12:30 <Dyrcona> Well, there's a private one that may have been fixed upstream, though I seem to recall thinking a patch on our end would be relatively simple.
12:31 <Dyrcona> Oh. I remember. It was supposed to be simple, but turned out not to be. :)
12:40 <Dyrcona> So many related bugs... We have to check if some of these security issues were fixed as a side effect of something else.
12:42 <jeffdavis> looks like open-ils.actor.user.itemsout.notices is another open-ils.actor method that needs a batch version to avoid drone exhaustion - seems like I'm running into a new one of these every day
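For readers following along: each in-flight OpenSRF request occupies a drone (a prefork backend child) until it completes, so a UI that issues one open-ils.actor.user.itemsout.notices call per checked-out item can tie up many drones at once, while a batch method answers the whole list from a single drone. A minimal sketch of the difference, where `call_opensrf` is a stand-in for a real OpenSRF client call and the `.batch` method name is hypothetical, not an existing API:

```python
# Sketch only: "call_opensrf" stands in for a real OpenSRF client request;
# the ".batch" method name below is hypothetical, not an existing API.

def fetch_notices_per_item(call_opensrf, authtoken, circ_ids):
    # One request per circulation: up to len(circ_ids) drones busy at once.
    return [call_opensrf("open-ils.actor",
                         "open-ils.actor.user.itemsout.notices",
                         authtoken, circ_id)
            for circ_id in circ_ids]

def fetch_notices_batched(call_opensrf, authtoken, circ_ids):
    # One request for the whole list: a single drone does the work and
    # streams the results back.
    return call_opensrf("open-ils.actor",
                        "open-ils.actor.user.itemsout.notices.batch",
                        authtoken, circ_ids)
```

The point is the request count, not the payload size: N concurrent requests hold N drones for their duration, which is what exhausts max_children under bursty client activity.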
12:44 <Dyrcona> jeffdavis++
12:48 <jeffdavis> What are people setting open-ils.actor max_children at these days? I've got it up to 200, and that's still not enough. I'm hesitant to raise it again because we'll start hitting resource limits on the servers.
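For anyone looking for the knob: max_children is set per service in opensrf.xml, inside the service's <unix_config> element, and the service must be restarted for a change to take effect. An abridged sketch of an open-ils.actor stanza (the element layout follows the stock file, but the values here are illustrative; stock defaults are much lower):

```xml
<open-ils.actor>
  <unix_config>
    <!-- maximum number of drone (child) processes this service may fork -->
    <max_children>200</max_children>
    <!-- keep some idle drones warm so bursts don't pay fork latency -->
    <min_children>10</min_children>
    <min_spare_children>5</min_spare_children>
    <max_spare_children>20</max_spare_children>
  </unix_config>
</open-ils.actor>
```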
12:52 <Dyrcona> jeffdavis: Mine is 150, but we run two drone servers per brick, so effectively 300.
12:55 * sandbergja joined #evergreen
12:57 <jeffdavis> Do requests get distributed across both drone servers? Like, if you look up a user's circs in the client, would both drone servers get the open-ils.actor.user.itemsout.notices requests, or would they all go to just one of the drone servers?
12:58 <Dyrcona> That's a good question, and I'm not sure that I can answer it.
12:59 <Dyrcona> Requests do get spread out across both drone VMs. Usually when there's a problem, one is at 150 children and the other is the same or very close, like 149 or so.
13:00 <Dyrcona> I just don't know enough to answer about your specific case.
13:03 <jeffdavis> We don't use a brick setup. Generally a burst of requests tied to a single action in the client all go to the same server rather than being distributed among the 4 servers, so we'll often hit max children on one server while the others may be relatively low.
13:07 <Dyrcona> Well, it seems like the same or similar requests from one client mostly hit one brick within a certain amount of time. For us, a brick is 3 servers: the "head" running nginx and apache, and 2 drones running the opensrf services.
13:07 <Dyrcona> We have 6 bricks altogether.
13:08 <Dyrcona> With a load balancer (ldirectord) in front.
13:12 <jeffdavis> We have 4 self-contained application servers (nginx+apache+opensrf on each) behind a load balancer.
13:12 <Dyrcona> There's also an issue with how many pcrud and cstore drones are set up to run, because most of the services end up talking to them to get to the database.
13:13 <Dyrcona> We can get by on 5 bricks, but added a 6th last year when most of our members switched to the web staff client from XUL.
13:14 <Dyrcona> We saw exactly what you're talking about with open-ils.actor calls.
13:14 <Dyrcona> I'm in favor of more batch calls.
13:15 <jeffdavis> Yeah, the trouble for us is that we can add more servers, but that won't avoid maxing out drones on any given one, which affects anyone else who hits that server for the duration.
13:16 <Dyrcona> Yeah, same here. We stopped at 6.
13:16 <jvwoolf1> jeffdavis: Dyrcona: Was just following along. What kinds of problems would you see if you were hitting up against the max child limit?
13:16 <Dyrcona> It's not too expensive when using VMs for the bricks.
13:17 <jeffdavis> jvwoolf1: with open-ils.actor exhaustion, certain UIs don't load properly, like parts of the screen remain blank
13:17 <Dyrcona> Yeahp.
13:18 <Dyrcona> Users complain it's sloooooowwww.
13:18 <jvwoolf1> Hmm
13:19 <jvwoolf1> I just looked it up, and ours is at 40 for each brick
13:19 <jvwoolf1> And we've been having those complaints lately
13:19 <Dyrcona> There's something you can look for in the logs. Give me a couple of minutes.
13:19 <jeffdavis> "no children available"
13:20 <jvwoolf1> jeffdavis++
13:20 <jvwoolf1> To the logs!
13:20 <jeffdavis> we've also seen "Service open-ils.actor unavailable" in the browser console when using the client
13:21 <jeffdavis> not 100% sure that was due to drone exhaustion; "no children available" in server logs is more definitive
13:21 <Dyrcona> Yeah, that's it.
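A quick way to turn those log hits into a per-hour tally, sketched against synthetic log lines (the real warnings appear in the OpenSRF system log, e.g. osrfsys.log, whose path and exact line format vary by install; only the "no children available" text comes from the discussion above):

```python
from collections import Counter

# Synthetic osrfsys.log-style lines; the real format varies by install.
SAMPLE_LOG = """\
2021-04-27 13:19:02 brick1 osrf_listener [WARN] server: no children available, waiting...
2021-04-27 13:19:05 brick1 osrf_listener [WARN] server: no children available, waiting...
2021-04-27 14:02:11 brick1 osrf_listener [WARN] server: no children available, waiting...
"""

def starvation_by_hour(lines):
    # Keep the "YYYY-MM-DD HH" prefix of each matching line and tally.
    return Counter(line[:13] for line in lines
                   if "no children available" in line)

print(starvation_by_hour(SAMPLE_LOG.splitlines()))
```

To use it for real, feed it the lines of your own log file instead of SAMPLE_LOG; a spike confined to one hour tells you whether drone starvation lines up with the slowness complaints.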
13:21 <jvwoolf1> What other settings do you have for open-ils.actor, if you don't mind me asking?
13:22 * sandbergja joined #evergreen
13:23 <pastebot> "jeffdavis" at 168.25.130.30 pasted "open-ils.actor config" (22 lines) at http://paste.evergreen-ils.org/10064
13:23 <jeffdavis> jvwoolf1: ^
13:25 <jeffdavis> I see max_requests is a little low there, actually
13:25 <jeffdavis> not sure what effect that has; we certainly get all the way up to max_children regularly
13:26 <Dyrcona> Well, there are two max_requests settings, and I'm wondering which one actually works. We have 93 and 1000 in the outer and inner ones, respectively.
13:27 <Dyrcona> We get there from time to time, but I stopped paying much attention after the complaints died down. It seems to happen in spurts, like for a day or two, and then nothing for a while.
13:27 <jvwoolf1> Dyrcona: We have the same
13:28 <jvwoolf1> And I was wondering the same thing.
13:28 <Dyrcona> Well, looks like it is happening today, but no one is complaining.
13:28 <Dyrcona> I guess it still happens all the time.
13:29 <Dyrcona> I think the inner one takes precedence, but I'd have to read the settings code again.
13:29 <jeffdavis> the outer one is "max stateful REQUEST requests before a session automatically disconnects a client"; the inner one is "max requests per process backend before a child is recycled", according to comments in opensrf
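In other words, the two same-named settings sit at different depths of the service stanza in opensrf.xml, which is easy to misread at a glance. A fragment showing just those two elements (surrounding elements omitted; the comments paraphrase the opensrf descriptions quoted above, and the values are the 93/1000 pair mentioned in this discussion):

```xml
<open-ils.actor>
  <!-- outer: max stateful REQUEST requests before a session
       automatically disconnects a client -->
  <max_requests>93</max_requests>
  <unix_config>
    <!-- inner: max requests a backend child serves before it is
         recycled (killed and re-forked) -->
    <max_requests>1000</max_requests>
  </unix_config>
</open-ils.actor>
```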
13:30 <Dyrcona> Yeah. Seems like they should have different tags/names.
13:31 <Dyrcona> Only there are only 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors.
13:32 <jeffdavis> in any case, nothing to do with max_children
13:32 <jvwoolf1> jeffdavis++ Dyrcona++
13:32 <jvwoolf1> Looks like we're getting that error pretty frequently
13:32 <jvwoolf1> With open-ils.supercat too
13:33 <Dyrcona> Well, it looks really bad when there are hundreds or thousands of entries in a minute from the same brick, but in our case it isn't as bad as it looks.
13:36 * mmorgan1 joined #evergreen
13:37 <jvwoolf1> Still, since we're getting slowness complaints, it looks like it's a good idea to increase it for us.
13:38 * mantis2 joined #evergreen
13:41 <jvwoolf1> jeffdavis: Dyrcona: Can I ask about your open-ils.supercat max child settings?
13:41 <jeffdavis> 50 for us, which we very rarely hit
13:43 <Dyrcona> Ours is 40 total.
13:45 <jvwoolf1> Wow, I wonder why ours is getting hit so frequently. It's 50.
13:45 <Dyrcona> Do your libraries use a lot of public buckets?
13:46 <Dyrcona> I think those go through supercat, as do the various feeds.
13:46 <jvwoolf1> We have them set up for local awards, so yeah, we have them linked right at the top of the catalog
13:47 <Dyrcona> That could be part of the reason.
13:49 <jvwoolf1> That makes sense. I'll see about increasing that limit too. We just had a hardware upgrade, so our servers should be able to handle it.
13:50 <Dyrcona> Our drone servers are almost never under any load, even when the actor drones are maxed out.
13:56 <jvwoolf1> Dyrcona: Good to know
14:30 <mantis2> Just wanted to ask about circulation stats when using the Emergency Closing Handler.
14:31 <mantis2> I did some circulation actions prior to closing for a couple of days and immediately processed those dates, then I did some circulation actions afterwards. But I can't see any numbers accumulating under the date's stats in the Closed Dates Editor.
14:31 <mantis2> Is there a different way to get this information?
14:33 <csharp> mantis2: not sure I'm understanding... you did some circs, closed the branch, then did more circs?
14:34 <mmorgan> mantis2: Only circulations whose due dates fall on the emergency closure dates will be affected by the emergency closure processing. Do you have due dates that fall on those closed dates?
14:34 <csharp> mantis2: also beware bug 1869728
14:34 <pinesol> Launchpad bug 1869728 in Evergreen "Emergency closure processing fails if overlap existing closed date" [High,Confirmed] https://launchpad.net/bugs/1869728
14:39 <mantis2> mmorgan: I didn't, but I appreciate the guidance
14:40 <mantis2> For whatever reason I was lost on how to test it
14:40 <mantis2> But I think I understand now. Thank you both!
14:40 <mantis2> mmorgan++
14:40 <mantis2> csharp++
15:08 * collum joined #evergreen
15:19 * Cocopuff2018 joined #evergreen
15:32 * mantis2 left #evergreen
15:54 * sandbergja joined #evergreen
15:55 <pastebot> "Bmagic" at 168.25.130.30 pasted "Brick monitoring script" (15 lines) at http://paste.evergreen-ils.org/10065
15:55 <Bmagic> jeffdavis Dyrcona jvwoolf1 - FWIW ^^^
15:57 <Bmagic> That is the main stuff from our full setup: https://github.com/mcoia/mobius_evergreen/tree/master/monitoring-tools/egstats
16:00 <Bmagic> gives this result: https://ibb.co/2vDrVWw
16:02 <jvwoolf1> Bmagic++
16:02 <Dyrcona> Bmagic: Nice!
16:02 <Dyrcona> Bmagic++
16:59 * collum joined #evergreen
17:01 * collum_ joined #evergreen
17:03 * mmorgan left #evergreen
17:41 * jvwoolf1 left #evergreen
18:00 <pinesol> News from qatests: Testing Success <http://testing.evergreen-ils.org/~live>
18:30 * yar joined #evergreen
18:57 * yar joined #evergreen
22:14 * sandbergja joined #evergreen
22:53 * sandbergja joined #evergreen