Evergreen ILS Website

IRC log for #evergreen, 2025-03-26

All times shown according to the server's local time.

Time Nick Message
07:08 collum joined #evergreen
07:13 book` joined #evergreen
08:43 mmorgan joined #evergreen
08:47 dguarrac joined #evergreen
09:04 Dyrcona joined #evergreen
09:07 Dyrcona So, our dev/training system has action_trigger_runner.pl piling up for the same jobs (i.e. granularity) and never finishing. I didn't count, but there were at least 5 password-reset granularity a/t runners going and at least two that we run every half hour. I hope a vacuum full analyze helps.
09:07 Dyrcona This is with Evergreen 3.14.4
09:10 Dyrcona Oh, and the srfsh jobs all reported that they could not bootstrap the client overnight. The opensrf_core.xml file is in the usual place.
09:11 Dyrcona We've had issues with this system before, but I wonder if anyone else has encountered something like this. I suspect it's just a lack of resources, RAM or CPU, but thought I'd bring it up in case someone else has seen something like this before.
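A rough way to count piled-up runners per granularity, as in the situation described above, is sketched here in Python. It is only a sketch under assumptions: it assumes each runner's command line contains `action_trigger_runner.pl` and a `--granularity` option followed by its value, and the helper name `count_runners` is hypothetical.

```python
import re
import subprocess
from collections import Counter
from typing import Iterable

# Assumption: the granularity is passed as "--granularity NAME" or
# "--granularity=NAME" on the runner's command line.
GRANULARITY_RE = re.compile(r"--granularity[= ](\S+)")


def count_runners(command_lines: Iterable[str]) -> Counter:
    """Count action_trigger_runner.pl processes, keyed by granularity."""
    counts: Counter = Counter()
    for cmd in command_lines:
        if "action_trigger_runner.pl" not in cmd:
            continue
        match = GRANULARITY_RE.search(cmd)
        # Runners started without a granularity are grouped under "(none)".
        counts[match.group(1) if match else "(none)"] += 1
    return counts


if __name__ == "__main__":
    # "ps -eo args" prints the full command line of every process.
    ps = subprocess.run(
        ["ps", "-eo", "args"], capture_output=True, text=True, check=True
    )
    for granularity, n in count_runners(ps.stdout.splitlines()).most_common():
        print(f"{granularity}: {n}")
```

Any granularity with more than one runner is a candidate for the pile-up described here.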
09:21 Dyrcona this vacuum full analyze is taking a long time.
09:22 Dyrcona Caught error from 'run' method: Exception: OpenSRF::EX::JabberDisconnected 2025-03-26T06:37:57 OpenSRF::Application /usr/local/share/perl/5.34.0/OpenSRF/Application.pm:240 JabberDisconnected Exception: This JabberClient instance is no longer connected to the server
09:22 Dyrcona So, jabber settings, maybe?
09:25 Dyrcona Everything looks OK. I'm going to bump the max stanza size.
09:36 Dyrcona I bumped it to 1MB + 6,144 bytes because it looks like the prior settings were something like 256,000 + 6,144.
09:38 JBoyer srfsh doesn't use opensrf_core.xml directly unless you set up a symlink to it from ~/.srfsh.xml
09:38 JBoyer Though if ~/.srfsh.xml is there and accurate it's hard to say what's wrong
09:39 Dyrcona JBoyer: I'll double check .srfsh.xml. It might be wrong.
09:39 Dyrcona Well, out of date.
09:41 Dyrcona Ah, yeahp. It has the redis password in it and not the ejabberd password.
09:41 Dyrcona JBoyer++
09:41 Dyrcona That explains the srfsh jobs not starting.
09:42 Dyrcona I want the opensrf user's private password in there, right?
09:44 Dyrcona Well, duh. It says so right in .srfsh.xml.
09:44 Dyrcona Oh, and the port needs to change.
09:46 Dyrcona the vacuum full is still chugging away after 40 minutes.
09:59 Dyrcona Either GCP storage is really slow or this database needs a lot of cleaning up....54 minutes and counting.
10:10 Dyrcona Never seen this before: 2025-03-25 13:32:37.409902-04:00 [error] <0.88.0>@ejabberd_system_monitor:do_kill/2:290 Killed 1 process(es) consuming more than 10310 message(s) each
10:13 Dyrcona 2025-03-26 09:01:56.882977-04:00 [info] <0.4136.0>@ejabberd_c2s:process_terminated/2:292 (tcp|<0.4136.0>) Closing c2s session for opensrf@private.localhost/open-ils.supercat_drone_at_localhost_67134: Connection failed: connection closed
10:14 Dyrcona A lot of lines like that. I wonder if that's bots or Aspen?
10:14 Dyrcona We do have a test instance of Aspen talking to this server.
10:35 Rogan Dyrcona am I imagining it or were you looking at pulling digital bookplates into Aspen at some point?
10:41 Rogan ignore ^, found it
10:43 Christineb joined #evergreen
11:15 Dyrcona Two hours and 10 minutes so far. I opened a ticket about the db storage.
11:33 jeff Dyrcona: what type of storage volume? PD, Hyperdisk, or Local SSD?
11:33 jeff and how much data are you rewriting?
11:33 jeff (er, asked another way, "how large is the db on disk?")
11:34 jeff oh, and I forgot about Extreme PDs as an option! :-)
11:35 Dyrcona jeff: According to df; 387GB.
11:36 Dyrcona Hm. Now it says 425GB.
11:39 Dyrcona it's growing while the vacuum full makes copies of tables.
11:45 jihpringle joined #evergreen
11:50 Dyrcona Yeah. Down to 403GB now.
11:55 mantis joined #evergreen
11:57 jihpringle joined #evergreen
12:05 jihpringle38 joined #evergreen
12:08 csharp_ Dyrcona: we ended up setting max_stanza_size to max_stanza_size: 10000000 on our A/T server
12:08 Dyrcona csharp_: I used to set it to 10 * 1024 * 1024.
12:08 csharp_ Dyrcona: you should see the erroring out in the ejabberd logs
12:08 Dyrcona So 10MB
12:09 csharp_ haven't run A/T on redis yet, we'll see if that's better eventually
12:09 Dyrcona yeah, I used to grep for something like stanza too large or whatever. I tried looking for stanza, and just came up with someone named Costanza in the logs.
12:09 berick haha
12:09 * csharp_ is off but couldn't resist responding
12:09 Dyrcona I have run A/T on redis. It's fine.
12:09 csharp_ @who is killing independent George?
12:09 pinesol scottangel_ is killing independent George.
12:10 Dyrcona This machine used to be set up for Redis, but after an update/rebuild in December, it started having issues so we switched back to Ejabberd.
12:10 * Dyrcona is leaning toward the db storage being too slow.
12:10 csharp_ Dyrcona: also something like "large message" in the osrf logs
12:10 Dyrcona yeah...
12:10 Dyrcona I'll look for large message.
12:12 csharp_ we learned from pain that A/T processing needs 1) tons of RAM 2) unreasonably large max_stanza_size on ejabberd and 3) granularity to distribute things more evenly
12:12 csharp_ all of that assumes a fast-ish DB, yes
12:12 Dyrcona Yeah. I've learned that, too.
12:13 Dyrcona Maybe I'll set max_stanza_size to 10MB anyway. There are no services running. I want this vacuum to run unhampered.
12:29 Dyrcona It finished after 3 hours and 18 minutes.
12:29 Dyrcona 363GB are used on the database partition.
12:36 Dyrcona I really like Emacs regex replace because you can execute elisp code: \(max_stanza_size: \)[0-9]+ → \1\,(format "%d" (+ 6144 (* 1024 1024 10)))
12:36 Dyrcona \, is used to indicate that the following is elisp code that returns a string.
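For readers without Emacs, the same substitution can be sketched in Python. This is a minimal sketch, assuming the setting appears in the ejabberd config text as `max_stanza_size: ` followed by digits; the function name `bump_max_stanza_size` and the sample input line are hypothetical.

```python
import re

# 10 MB plus the 6,144 bytes of headroom used in the replacement above.
NEW_SIZE = 10 * 1024 * 1024 + 6144


def bump_max_stanza_size(config_text: str, new_size: int = NEW_SIZE) -> str:
    """Replace the number after every 'max_stanza_size: ' with new_size."""
    pattern = re.compile(r"(max_stanza_size: )[0-9]+")
    # A function replacement keeps the captured prefix and avoids any
    # ambiguity between a group reference and the digits that follow it.
    return pattern.sub(lambda m: m.group(1) + str(new_size), config_text)


if __name__ == "__main__":
    sample = "    max_stanza_size: 262144\n"  # hypothetical input line
    print(bump_max_stanza_size(sample), end="")
```

As with the Emacs version, the value is computed rather than typed by hand, so the arithmetic stays visible in the replacement itself.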
12:44 Dyrcona csharp_: I finally found "Stream closed by local host: XML stanza is too big (policy-violation)" in the Ejabberd logs.
12:45 Dyrcona open-ils.trigger drone was the last one to do it.
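Searching for the exact message quoted above is more reliable than searching for "stanza" alone. A minimal Python sketch of that search follows; the helper name `find_stanza_errors` is hypothetical, and the log file path is taken from the command line because its location on this system is not given in the log.

```python
import sys
from typing import Iterable, List

# The message text as quoted from the ejabberd log above.
NEEDLE = "XML stanza is too big"


def find_stanza_errors(lines: Iterable[str]) -> List[str]:
    """Return the log lines that report an oversized XML stanza."""
    return [line.rstrip("\n") for line in lines if NEEDLE in line]


if __name__ == "__main__":
    # Usage: python find_stanza_errors.py /path/to/ejabberd.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for hit in find_stanza_errors(log):
            print(hit)
```

Each matching line should name the session that was closed, which is how the offending drone can be identified.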
13:04 Dyrcona I haven't found anything in osrfsys.log that obviously corresponds to the XML stanza is too big message, but there are 3,247 log lines for the second of the timestamp of ejabberd message in osrfsys.log.
13:05 Dyrcona The open-ils.trigger stderr log has the timestamp of the minute that the error occurred. I pasted the last entry earlier.
13:06 Dyrcona I think the stderr logs should include timestamps for individual log entries. Maybe I should LP that?
13:08 Dyrcona I seem to recall having to adjust the log settings in opensrf_core.xml before the ejabberd messages would pass through to osrfsys.log.
13:25 Dyrcona The message from the stderr log shows up in osrfsys.log, so there's tat.
13:26 Dyrcona s/tat/that/
13:37 jihpringle joined #evergreen
13:58 jihpringle joined #evergreen
14:05 jihpringle24 joined #evergreen
14:20 jihpringle joined #evergreen
15:13 Rogan to everyone in channel, I have not sent out a formal announcement yet but we will be looking for a host for this year's Hack-A-Way.  If anyone is interested keep an eye out for the announcement.
15:18 mantis left #evergreen
15:55 Dyrcona Oh crap. We're still under the commit moratorium, aren't we? I totally forgot and pushed something to main, rel_3_13, and rel_3_14.....
15:56 Dyrcona I was just about to add it to rel_3_15 when it struck me what I was doing.
15:57 Dyrcona I think I can actually fix it. let me give it a shot.
15:57 Dyrcona Nope. Maybe that rule only applies to main...
15:58 Dyrcona Eh, no. Only applies to certain repos, not Evergreen.
16:10 abneiman Dyrcona: we are still under the moratorium but it will be lifted by the end of today, hopefully
16:11 abneiman apologies for the process taking longer than usual, but I was last minute out of town at one conference (while presenting at another conference) and we have a lot of people (happily) learning new steps in this process, myself included. Appreciate your patience.
16:14 Dyrcona abneiman: Thanks. I don't think much actual harm is done, since rel_3_15 wasn't touched. Whoever is working on branches should be able to add the db upgrade to main in more or less the usual way. If whoever that is needs help, contact me, and I'll gladly do it.
16:14 abneiman thanks, will do
16:14 mmorgan abneiman++
16:14 mmorgan release_team++
16:14 mmorgan Dyrcona++
16:16 pinesol News from commits: LP#2051946: Add Co-authored-by to commit-template <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=dd23b9bb54d324ea5e0cb250501857458e4ac9f0>
16:46 jihpringle joined #evergreen
16:49 * Dyrcona is clocking out, but I'll keep an eye on email.
17:25 mmorgan left #evergreen
18:35 abneiman redavis++
18:49 jihpringle joined #evergreen
19:26 jihpringle joined #evergreen
21:28 abowling1 joined #evergreen
22:16 abowling joined #evergreen