Evergreen ILS Website

Results

Results for 2024-03-08

10:02 jvwoolf joined #evergreen
10:44 sandbergja joined #evergreen
11:13 kworstell-isl joined #evergreen
11:43 * Dyrcona is trying to test marc_stream_importer, and I'm pretty sure that I have it set up correctly. When I try to cat a marc file at it with nc, the record never queues, and I get no error messages.
11:43 Dyrcona I can see the connection to the importer in the logs, but nothing else that looks related.
11:54 Dyrcona There's nothing in the log after the connection except stream importer saying it is starting a child. No osrf syst activity. nada.
12:02 jihpringle joined #evergreen

Results for 2024-01-24

10:43 Bmagic Stompro: in this example, eight
10:45 Stompro I was planning on looking at how to turn marc_export into a multi process exporter... but I'm only a part time programmer, not sure when I'll get back to that.
10:46 Dyrcona marc_export would be faster if you run multiple instances. Just split your ID file up and run multiple exports, then splice them together at the end.
10:46 Dyrcona you can just cram binary MARC files together. XML would require a bit more work.
10:47 Bmagic Dyrcona: yes, that comes to mind. Splitting the IDs then running 8 processes of marc_export. Then we'd need to combine the outputs at the end. My extractor is here: https://github.com/mcoia/mobius_evergreen/tree/master/bib_extract
10:47 Bmagic be gentle, that code is old and junky
10:48 Bmagic It keeps working for me, so I've not prioritized its rewrite
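The split-and-splice approach discussed above can be sketched in Python. This is only an illustration under stated assumptions: `split_ids` and `concat_binary` are hypothetical helper names, the demo payloads are stand-ins rather than real MARC, and running the eight marc_export processes themselves is left out. Plain concatenation works for binary MARC because each record carries its own length in the leader and ends with a record terminator.

```python
import os
import tempfile


def split_ids(ids, parts):
    """Split a list of record IDs into `parts` contiguous, nearly equal chunks."""
    size, extra = divmod(len(ids), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(ids[start:end])
        start = end
    return chunks


def concat_binary(paths, out_path):
    """Append binary files end to end, in the order given."""
    with open(out_path, "wb") as out:
        for path in paths:
            with open(path, "rb") as src:
                out.write(src.read())


# Hypothetical demo: 20 IDs split for eight export processes.
chunks = split_ids(list(range(1, 21)), 8)

# Stand-in files; real inputs would be the per-chunk export outputs.
workdir = tempfile.mkdtemp()
part_paths = []
for n, payload in enumerate([b"record-one\x1d", b"record-two\x1d"]):
    path = os.path.join(workdir, f"part{n}.mrc")
    with open(path, "wb") as fh:
        fh.write(payload)
    part_paths.append(path)

combined = os.path.join(workdir, "all.mrc")
concat_binary(part_paths, combined)
```

As noted in the channel, the same trick does not carry over to XML directly, since each output file has its own enclosing collection element.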

Results for 2024-01-23

15:09 Dyrcona Pg 15 mostly.
15:09 Dyrcona Are you having issues?
15:10 Bmagic oh good. I have done the same recently. It worked (pg 15 for me too). Though, there were many records it threw some console errors about. It still resulted in a marc file. And that file contained errors according to MARCEdit, which it happily stripped out for me using the validator tool.
15:11 Dyrcona What console errors? I'm not sure that has to do with the Pg version so much. It's more likely down to character set issues in the MARC.
15:11 Bmagic the DB is "C" and UTF-8, same as it was on PG10. I'm troubleshooting an export for a VuFind instance. VuFind doesn't like the export (all of a sudden) - one change we made was upgrading to pg15 from 10. Just trying to rule that out as a possible issue. I think it's just plain bad records that were introduced recently. and the pg version is a red herring
15:12 Bmagic yep, character set issues. Which, we're no strangers to. But the underlying DB version could play a role.
15:13 Dyrcona Did you upgrade Ubuntu, too? There was an Ubuntu upgrade that required reindexing the database or something because the Unicode library version changed.
15:30 Dyrcona I might as well start one of them now.
15:31 Dyrcona I should also make sure that they're using the same marc_export.
15:32 jvwoolf joined #evergreen
15:35 Dyrcona Bmagic: you are dumping binary MARC with encoding UTF-8?
15:36 Dyrcona I've got one that, for some reason, dumps XML then uses yaz-marcdump to convert it to binary MARC.
15:37 Dyrcona XML would be easier to compare.
15:37 Dyrcona ...Even if the file is bloated.
15:38 * Dyrcona started a binary dump already and decides to let it go.

Results for 2023-12-22

09:43 Dyrcona We found numerous "bad" bibs as a part of this project.
09:44 Dyrcona Also: https://bugs.launchpad.net/evergreen/+bug/2045440/comments/7
09:44 pinesol Launchpad bug 2045440 in Evergreen "Add Public Copy Notes to MARC export when items requested" [Wishlist,New] - Assigned to Jason Stephenson (jstephenson)
09:45 Dyrcona Stompro: Audience and literary form should be properly encoded in the MARC. I think it would be bad to have different Evergreen sites with customized exports for Aspen, etc. We really should be doing "standard" things as much as possible.
09:59 sleary joined #evergreen
10:06 mantis1 joined #evergreen
10:21 mantis1 joined #evergreen

Results for 2023-12-19

16:06 Bmagic I always assumed that pingest did all the same things as saving the record in the marc edit interface
16:10 Dyrcona Bmagic: Depends. pingest runs the database functions. It may also be out of date for newer ingest requirements.
16:12 Bmagic there is some magic that the interface is doing
16:13 Dyrcona No. It's all done in the database. Updating the marc runs a trigger in the database. There may be new parts of that which pingest doesn't know about.
16:13 Bmagic I have rows for this record in metabib.title_field_entry, as well as reporter.materialized_simple_record
16:14 Bmagic so, it looks good from the database side, but the item status still doesn't show the title. It must be getting it from some other table
16:14 Dyrcona Well, the new OPAC, staff client, etc. don't always use the same fields.

Results for 2023-12-08

15:30 Dyrcona "They say these days are made of rust..." Or should that be Rust? Eh, berick?
15:33 Stompro I'm looking at how Aspen categorizes things as fiction/non-fiction... and it categorizes poetry as Fiction by default?  Which seems wrong for how Libraries usually categorize things.
15:33 Dyrcona Command line programming with PHP: I don't recommend it if you don't need to for some crazy reason... like oh, testing  the Aspen Evergreen driver without installing Aspen.
15:35 Dyrcona Stompro: Most of that comes from the MARC, I wager. Not sure if poetry can also say "fiction" in the coded values/wherever that lives, but I've seen lots of crazy stuff in MARC records over the years...decades.
15:36 Dyrcona Come to think of it, I don't recommend web programming with PHP, either.
15:36 Dyrcona :)
15:37 Dyrcona Poetry should probably be its own category though.
15:57 kmlussier I'm also not surprised jweston has published a book in every Dewey classification. She's very talented.
16:03 sleary I have THOUGHTS about the fact that MARC has a field for festschrift but not fiction/nonfiction.
16:03 * berick had to google
16:04 Dyrcona I have many thoughts about MARC....most of them... not good.
16:04 JBoyer I lack the energy to act on it but Dyrcona and berick's conversation above has me thinking about a Malware Radio parody of Wall of Voodoo's Mexican Radio, which I enjoyed quite a bit back in the day.
16:04 berick yeah...
16:04 Dyrcona "I'm a Mexican woa-oah radio."

Results for 2023-12-06

09:22 JBoyer ^ too real robot, too real.
10:00 Dyrcona 2 days 1 hour 4 minutes at this point. :)
10:06 Dyrcona using 68GB (roughly) of RAM on the DB server for this one long query now.
10:12 Dyrcona I should have disabled the ingest. The only thing changing in the MARC is the 901.
10:53 sandbergja joined #evergreen
10:56 sandbergja gmcharlt++ # opensrf releases
11:02 briank joined #evergreen

Results for 2023-12-05

10:36 Dyrcona Actually, it's probably simple rec and some of the other triggers, too. I'm not sure I want to disable simple rec in a transaction. I'll have to investigate the triggers a bit more.
10:40 pinesol News from commits: LP1850473 (follow-up): Update DOM selector in nightwatch test <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=bddc372d27e4ac9298ef6d265534c3b14529b2a2>
10:41 Dyrcona No, I don't want to disable any triggers. It would lead to locking and possible dead locks. Setting session replication role is overkill since it disables all triggers.
10:45 Dyrcona Looks like the ingest triggers are getting fired, too, but they.... Oh wait. The marc changes....
10:53 Dyrcona Trying different batch sizes, with a limit on a subquery, It did 10 in 5 seconds 100 in 26 seconds and 1000 took 4 minutes 23 seconds and used a lot more memory.
10:53 * Dyrcona tries 500.
10:56 Dyrcona 1m40.676s...
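The batch timings quoted above are easier to compare per record. A small worked calculation, using only the numbers from the log (the variable names are illustrative):

```python
# Batch size -> elapsed seconds, as reported in the channel.
timings = {
    10: 5.0,
    100: 26.0,
    1000: 4 * 60 + 23,   # 4 minutes 23 seconds
    500: 60 + 40.676,    # 1m40.676s
}

# Seconds spent per record at each batch size.
per_record = {size: secs / size for size, secs in timings.items()}

# Batch size with the lowest per-record cost among those tried.
best = min(per_record, key=per_record.get)

for size in sorted(per_record):
    print(f"{size:>5} records: {per_record[size]:.3f} s/record")
```

On these four samples the 500-record batch comes out cheapest per record, with 100 and 1000 nearly tied behind it. Four single runs are a thin basis, so this is a hint rather than a measurement.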
14:01 csharp_ reingest began ~11:30 a.m. yesterday - a little over half done per the batch count
14:07 Dyrcona csharp_: My case was nearly catastrophic. Cascading deadlocks stemming from a process trying to delete a staff account with thousands of owned records. It was an acquisitions account. The person doing the delete tried it at least 3 more times after the first timed out in the client.
14:10 Dyrcona This sort of thing used to happen so much in Horizon with Sybase, that I made a little GUI in Java to show me the deadlocked processes. From that I could identify the most likely culprit process. I could then click its row in the table, type Ctrl-k and that would send the Sybase equivalent of pg_cancel_backend. Every now and then, I consider adapting that to PostgreSQL and Evergreen.
14:13 Dyrcona Looks like someone was also loading MARC order records at the same time. I may have clobbered their backend process, not sure. I tried to only cancel the delete_usr queries.
14:23 Dyrcona Speaking of long-running processes my update of tcn_source on 530,105 bibs where it is '' has been running for 1 day, 5 hours, and 27 minutes at this point.
14:25 Dyrcona Fortunately, that is on a test system, or I'd have probably had to cancel it for deadlocks.
14:29 smayo joined #evergreen

Results for 2023-12-01

12:04 pinesol Dyrcona: well, that's what you get for not being a shell script
12:04 Dyrcona pinesol: I sometimes wonder if I'm not just a Perl one-liner.
12:04 pinesol Dyrcona: Have you confirmed your ISBN SPIDs with your service provider?
12:06 Dyrcona So, I am having fun with malformed MARC records. They have empty 901$b, and it turns out that tcn_source in biblio.record_entry is null for these records. That isn't supposed to be because the column is defined as "not null default 'AUTOGEN'::text."
12:07 Dyrcona Now I have to figure out what, if anything, these records have in common and how they're getting a null in bre.tcn_source when that should not be possible.
12:08 jihpringle joined #evergreen
12:09 Christineb joined #evergreen

Results for 2023-11-17

09:27 Bmagic kmlussier++
09:44 Dyrcona So, I think I've found a "solution" to my MARC4J streams problem. I'm going to have two programs. The export program will write out the marcxml from Evergreen to a file and also write a file to map the database id with the record's position in the marcxml file.
09:44 Dyrcona The second program will read both files, parsing the marcxml with MARC4J and using the other file as a key to map the current record with the database id.
09:45 Dyrcona The first program doesn't have to be written in Java, and I suppose I could output a binary MARC file to save some space. MARCXML just seemed simpler, since I could just print the marc field directly.
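The two-program idea above (one file of MARCXML, one side file mapping record position to database id) might look roughly like this. It is a sketch in Python with the standard library rather than Java and MARC4J, and `write_export` and `read_export` are hypothetical names. It assumes every record parses, so positions stay aligned.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"


def write_export(rows):
    """Build (collection_xml, id_map_text) from (db_id, record_xml) pairs.

    Line N of the id map holds the database id of record N in the collection.
    """
    records, ids = [], []
    for db_id, record_xml in rows:
        records.append(record_xml)
        ids.append(str(db_id))
    collection = f'<collection xmlns="{MARC_NS}">' + "".join(records) + "</collection>"
    return collection, "\n".join(ids) + "\n"


def read_export(collection_xml, id_map_text):
    """Pair each parsed record with its database id by position."""
    ids = [int(line) for line in id_map_text.splitlines() if line]
    root = ET.fromstring(collection_xml)
    records = root.findall(f"{{{MARC_NS}}}record")
    return list(zip(ids, records))


# Hypothetical rows: (database id, MARCXML for one record).
rows = [
    (42, "<record><leader>a</leader></record>"),
    (7, "<record><leader>b</leader></record>"),
]
pairs = read_export(*write_export(rows))
```

The weak point is the positional key: if the reader skips or merges a damaged record, every later id is off by one, so a count check between the two files is a sensible guard.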
10:03 sleary kmlussier does pinesol know about matcha frapps?
10:03 Dyrcona my $file_content = do{local(@ARGV,$/)=$filename;<>}; # That's a gnarly trick.
10:05 kmlussier sleary: It certainly can be added as a dessert. :)
15:52 Dyrcona Hmm... The permissive stream reader includes binary in the exception output, looks like field separators or other low ASCII control characters.
15:54 Dyrcona Maybe this has not been worth the effort.
16:19 jihpringle joined #evergreen
16:44 Dyrcona I wonder, too, if sometimes the errors cascade. That is an error reading a previous record ends up messing up the read of the next few records even if they might be correct. There are things that will do that with MARC::Record.
16:51 jihpringle joined #evergreen
17:05 mmorgan1 left #evergreen
18:21 sandbergja joined #evergreen

Results for 2023-11-16

11:21 jeff oh, never mind. I think that's exactly what you said you were doing, and I misread.
11:24 Dyrcona Yeah. We're sending records to someone parsing them with MARC4J. I'm trying to implement a program to find records that MARC4J doesn't like and then output a spreadsheet of the errors for our catalogers.
11:24 Dyrcona I may go back to java.nio.Pipe. I had that sort of working, but when I added InputStream to the mix, the program would hang.
11:26 Dyrcona I know I'm getting I/O deadlock, and I'm using classes that are recommended for use with different threads in a single thread. Maybe if I spin off a thread for the MARC reader, but then I need to also get the database Id in that thread somehow.....
11:26 Dyrcona There's too much computer science in Java. It's definitely not a hacker's language.
11:27 kmlussier joined #evergreen
11:28 Dyrcona I am flushing the OutputStream before trying to read from the other end of the pipe.... Pipes in C are so much easier.  Well, maybe I'm more familiar with pipes in C. :)
12:12 Dyrcona That's at 11:24 EST. :)
12:13 Dyrcona I've already done this for a set of records that evergreen-universe-rs doesn't like.
12:13 berick oh, gotcha
12:15 Dyrcona I think I'll put this down for now and work on a program to convert some marcxml to binary marc to see if CPAN's RT will let me upload that. I get a 403 when I try to upload the marcxml examples from the Rust test.
12:15 Dyrcona "That should only take half an hour," he said knowing it was very likely to be a lie.
12:16 Dyrcona Also, mexican_coca-cola++ It tastes so much better with cane sugar than with HFCS.
12:22 Dyrcona What? libmarc-perl does not install MARC::File and friends? I thought that it did.
12:23 * Dyrcona grumbles about CPAN.....and that half hour will be spent just installing the tools.
12:33 Dyrcona marcdump [options] file(s) That's useful....
12:34 Dyrcona And, I have to write my own. marcdump doesn't work on XML.
12:35 Dyrcona I'm just full of complaints today, aren't I?
12:44 Bmagic you? never!
12:44 Dyrcona I'm installing MARC::File::XML with cpan set to local::lib, and there sure are a lot of prerequisites.
12:50 Dyrcona Failed 11/11 test programs. 3/5 subtests failed.
12:50 Dyrcona Right. I'll just run it on a server where this is already installed.
12:51 Dyrcona And, I'll wipe out the stuff that CPAN installed locally.
13:08 Dyrcona Looks like I may have to reboot. I just swapped monitors and the laptop doesn't see the new one.
13:14 Dyrcona joined #evergreen
13:16 Dyrcona hey! That's funny. MARC::Batch catches some of these errors: Leader must be 24 bytes long
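The leader check mentioned above is easy to reproduce outside Perl. A minimal sketch, assuming the input is MARCXML in the standard MARC21 slim namespace; `bad_leaders` is a hypothetical helper, not part of MARC::Batch or any other library named in the log.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.loc.gov/MARC21/slim}"
LEADER_LENGTH = 24  # a MARC leader is always 24 characters


def bad_leaders(marcxml):
    """Return (position, leader_length) for each record whose leader is not 24 characters."""
    root = ET.fromstring(marcxml)
    records = [root] if root.tag == NS + "record" else list(root.iter(NS + "record"))
    problems = []
    for position, record in enumerate(records, start=1):
        leader = record.findtext(NS + "leader") or ""
        if len(leader) != LEADER_LENGTH:
            problems.append((position, len(leader)))
    return problems


GOOD = "00000nam a2200000 a 4500"
sample = (
    '<collection xmlns="http://www.loc.gov/MARC21/slim">'
    f"<record><leader>{GOOD}</leader></record>"
    "<record><leader>00000nam</leader></record>"
    "</collection>"
)
problems = bad_leaders(sample)
```

Reporting the position rather than raising lets a whole file be scanned in one pass, which suits the goal of producing a list for catalogers.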
13:19 Stompro Dyrcona, have you looked at MARC::Lint already?
13:33 Dyrcona Stompro: Never heard of it.
13:33 Dyrcona Apparently, all of the software in the world has chosen this week to hate me: https://rt.cpan.org/Ticket/Display.html?id=150348&results=a5d68555ff4b4354e65ce6ec51f76634 # Read to the bottom...
13:37 Dyrcona gmcharlt: RT on CPAN is apparently broken for uploads at the moment. I've tried 3 times to add a file of records to that ticket above.
13:39 Dyrcona Stompro++ I'll give MARC::Lint a whirl.
13:57 Dyrcona Stompro: It looks like MARC::Lint may help. I'm running a test program already.
13:59 Stompro I wonder if it will be too verbose, or if you can pick out the bigger issues.  I'm curious how it performs also?
13:59 Dyrcona And, maybe not so much: is_valid_checksum: Didn't get object! at /usr/share/perl5/Business/ISBN.pm line 481, <DATA> line 244.
13:59 Dyrcona Well, it gets totally clobbered by our data after bib id 233519.
14:05 Dyrcona Dunno. That could be what it exploded on. I'm trying again with an eval BLOCK.
14:06 Dyrcona If it gets all the way through I'll use CSV, and output the warnings to a csv. I might output the errors to a separate one.
14:08 Dyrcona My catalogers will be sorry that they ever asked for this. :)
14:10 Dyrcona MARC::Lint seems to find something in nearly every record.
14:39 terranm joined #evergreen
14:54 Dyrcona hmm... What's the limit of rows in Excel, 32,000? I may have to split this up.
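On the row-limit question: an .xlsx worksheet holds 1,048,576 rows and the legacy .xls format 65,536, so splitting only matters for very large outputs or the old format. If splitting is needed, a sketch along these lines would do it; `split_csv` and `to_csv` are hypothetical helpers and the sample rows are invented.

```python
import csv
import io

XLSX_MAX_ROWS = 1_048_576  # worksheet row limit in .xlsx, header row included
XLS_MAX_ROWS = 65_536      # worksheet row limit in legacy .xls


def to_csv(header, rows):
    """Render a header plus data rows as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


def split_csv(header, rows, max_data_rows):
    """Yield CSV documents, each with the header and at most max_data_rows data rows."""
    chunk = []
    for row in rows:
        chunk.append(row)
        if len(chunk) == max_data_rows:
            yield to_csv(header, chunk)
            chunk = []
    if chunk:
        yield to_csv(header, chunk)


# Invented sample data; a real run would pass XLSX_MAX_ROWS - 1 as the limit.
rows = [(i, f"warning {i}") for i in range(1, 6)]
files = list(split_csv(["bib_id", "warning"], rows, 2))
```

Repeating the header in every chunk keeps each file usable on its own.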
14:55 kmlussier1 joined #evergreen
15:03 * Dyrcona tries hot swapping monitors again. If I disappear, I had to reboot.
15:10 Dyrcona Well, looks like I have to reboot.
15:16 Dyrcona joined #evergreen
15:28 Dyrcona Looks like MARC::Lint uses its own eval, and the errors that it passes up to the client program are not very useful for a cataloger: "Can't locate object method ""checksum"" via package ""0316110620"" (perhaps you forgot to load ""0316110620""?) at /usr/share/perl5/Business/ISBN.pm line 484, <DATA> line 244."
15:32 Dyrcona Stompro: Do you know about Tk::MARC::Editor and MARC::Errorchecks?
15:41 dluch joined #evergreen
15:45 jihpringle joined #evergreen
15:49 Stompro Dyrcona, nope, I haven't looked at them before.
15:50 Dyrcona I had a quick look at MARC::Errorchecks and it seems more cumbersome and nitpicky than MARC::Lint.
16:06 pinesol joined #evergreen
17:08 mmorgan left #evergreen
18:26 briank joined #evergreen

Results for 2023-11-15

09:56 sleary joined #evergreen
09:58 Dyrcona Ugh. I'm going to have to take the time to write a Perl program to convert these bad MARCXML records to binary. yaz-marcdump can't read 275 of them.
09:59 Bmagic marc--
10:04 Dyrcona Well, part of the problem is MARC::Record tries too hard to read bad records, and some legacy systems produced too many bad records.
10:11 Bmagic Dyrcona++ Dyrcona++ # Aspen woo
10:20 csharp_ @dunno add Automate ALL THE THINGS!
10:20 pinesol csharp_: The operation succeeded.  Dunno #83 added.

Results for 2023-11-07

12:01 Rogan joined #evergreen
12:10 jihpringle joined #evergreen
12:13 jblyberg7 joined #evergreen
12:21 Dyrcona Y'know what? I'm done with captchas. If a site asks me to solve a puzzle or some crap, I'm going away and not coming back. So, how is this relevant here? Well, it means no bug report on MARC::Records' RT. Well, let's see what happens if I sign in, first.
12:24 Dyrcona It wants you to solve a puzzle to logout of "guest." I know they've had issues with spam, but I'm so sick of "proving" that I'm human. (Also, turns out that bots are better at solving some of the puzzles than humans, so they consider failure to be success, too.)
12:31 pinesol News from commits: LP2039612: regression test for creating carousels <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=b418f2dfa8b9ec16d132b74c25c240480d2b5ed0>
12:31 pinesol News from commits: LP2039612: Fix Carousel create / edit <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=00bd4c88d60fedaf1a39a7d34a24f0c2e2bd2c00>
16:08 Dyrcona Hey! Wow! < 2 minutes with the cursor.
16:18 Dyrcona Hmm... Output is truncated. I think I have to actually close the Writer or File or something.
16:19 Rogan joined #evergreen
16:28 Dyrcona MARC4J doesn't seem to have any interface for taking a blob and turning it into a MARC record. it seems rather file and stream oriented.
16:32 Dyrcona That's enough for today. My eyes are tired.
17:10 mmorgan left #evergreen
17:32 pinesol News from commits: LP#2030820: scope line item alert types based on workstation OU <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=e2cb64d73544da47cd95e522c221ad13d9856705>

Results for 2023-11-06

09:00 sleary joined #evergreen
09:00 smayo joined #evergreen
09:02 mmorgan1 joined #evergreen
09:10 Dyrcona Stompro: It is 1,736,893 bibs. I can't really count the items because the main file is MARC binary.
09:11 Dyrcona BTW: I think the Perl MARC::Record is too permissive. I'm going to do some research on it and probably open a ticket in CPAN's RT.
09:12 Dyrcona We have a record with an empty 008 as an example. Evergreen deals with it just fine, but other systems mangle it.
09:24 Dyrcona The simple things seem to be a problem for me this morning, like opening files in Perl.... I know what the problem is... No music is playing!
09:25 mantis1 joined #evergreen
14:00 terranm Weekly 3.12 code review if anyone wants to join - https://www.google.com/url?q=https://princeton.zoom.us/my/sandbergja
14:13 ejk_ joined #evergreen
15:21 jeff terranm++ for sharing the link here
15:36 Dyrcona So, I'm thinking of implementing a third MARC exporter for Evergreen. This one would be written in Java using MARC4J to catch things that Aspen won't like. It's not so much meant to be for every day use.
15:54 Dyrcona Looking through the code I wrote for the MVLC migration to Evergreen from Horizon has given me some ideas. I had a SAX parser in there to check the MARCXML before trying to load a record. It would delete "empty" subfields and control fields. I could just yank the MARCXML from the database and check for busted elements as a start.
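The cleanup described above (dropping "empty" subfields and control fields before loading) can be sketched as follows. This uses Python's ElementTree instead of a SAX parser, `strip_empty` is a hypothetical name, and "empty" is taken to mean an element with no text at all, which is an assumption about what the original code checked.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.loc.gov/MARC21/slim}"


def strip_empty(record):
    """Remove control fields and subfields with no text; return how many were removed."""
    removed = 0
    for field in list(record):
        if field.tag == NS + "controlfield" and not field.text:
            record.remove(field)
            removed += 1
        elif field.tag == NS + "datafield":
            for subfield in list(field):
                if subfield.tag == NS + "subfield" and not subfield.text:
                    field.remove(subfield)
                    removed += 1
    return removed


sample = (
    '<record xmlns="http://www.loc.gov/MARC21/slim">'
    '<controlfield tag="001">123</controlfield>'
    '<controlfield tag="008"></controlfield>'
    '<datafield tag="245" ind1="1" ind2="0">'
    '<subfield code="a">Title</subfield><subfield code="b"/>'
    "</datafield></record>"
)
record = ET.fromstring(sample)
removed = strip_empty(record)
```

For a report-only pass, the same loop could collect tag and code values instead of removing elements.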
16:07 csharp_ jeffdavis++ # PG 14
16:59 mantis1 left #evergreen

Results for 2023-11-02

10:08 Dyrcona I'm not entirely sure how vis_attr_vector is used, but I'm sure it's most important for records with URIs.
10:13 sleary sandbergja: thank you for changing that route :)
10:14 Bmagic update config.internal_flag set enabled='f' where name~'ingest.reingest.force_on_same_marc'
10:15 Dyrcona Bmagic: Yeah. You may or may not want that flag enabled all the time. It depends on how much/how often your MARC actually changes.
10:16 Dyrcona You might be surprised to see how often MARC gets updated when it hasn't changed.
10:16 Bmagic I think that was the trick, still playing with it
10:18 Bmagic confirmed, that was it
10:18 Bmagic I could have swore I checked that before I started posting here
10:56 Dyrcona Awesome sauce! The "obvious" approach works.
10:57 Dyrcona `psql -v outputdir=output -f script`, then in the script: `\o :outputdir/outfile.dat`.
10:58 Dyrcona Think I'll shorten it to 'outdir' for the actual thing, though.
11:24 Dyrcona Bmagic (and berick for that matter): I'm going to run the Rust MARC exporter either later today or tonight to capture the error output. It seems to find more "bad" records than the Perl code. Just thought I'd give you a heads-up. I don't expect the load to spike on the utility server, but you never know.
11:25 Bmagic Dyrcona++ # you go on with your bad self
11:25 berick Dyrcona: cool, be curious to see what you find
11:26 Dyrcona Actually, I'll schedule it for 9:00 PM since we don't seem to have any database updates requested. I'll write something to parse the error output afterward. (It will be good practice.)
11:26 berick also what Bmagic said
11:26 Dyrcona :)
11:27 Dyrcona I'm going to extract all of the records, and I won't bother with holdings.
11:28 Dyrcona berick: Can the eg-marc-export do authorities, too? If not, that would be a useful feature.
11:28 berick Dyrcona: not yet
11:28 Dyrcona I'll work on a pull request, then. ;)
11:28 berick awesome

Results for 2023-10-30

10:43 BAMkubasa joined #evergreen
10:50 mantis1 joined #evergreen
10:55 briank joined #evergreen
11:15 Dyrcona OK. I have the latest Rust marc-export test running. I'll see how long this takes.
11:27 jihpringle joined #evergreen
12:14 Christineb joined #evergreen
12:18 eeevil berick: drive by question (from immediately-post hack-a-way) incoming! meetings will probably keep me away from here for the rest of the day, but want to get it to you asap for RediSRF(?) goodness

Results for 2023-10-27

10:19 Stompro I figured, my perl array skills need work. :-)
10:20 Dyrcona Maybe my suggestion requires more rearrangement of the code, though. Having a firsttag flag might fit better with the current code organization.
10:20 Dyrcona I wonder if the first one even needs to be grouped?
10:20 Dyrcona I'm going to look at MARC::Record again.
10:21 Dyrcona Stompro++ # For the notes in the snippets.
10:21 Stompro In my test data, the 901 tag would be placed before the 852 without using the insert_grouped_field for the first.
10:23 Stompro I don't think MARC::Record re-orders the fields.
12:38 Dyrcona Heh. Almost 1 minute longer.....
12:50 collum joined #evergreen
13:02 Dyrcona I am testing this now: time marc_export --all -e UTF-8 --items > all.mrc 2>all.err
13:14 Dyrcona The Rust marc export does batching of the queries by adding limit and offset. I wonder if we should do the same? I've noticed that the CPU usage goes up over time, which implies that something is struggling through the records. The memory use stays mostly constant once all of the records are retrieved from the database.
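Batching a query with limit and offset, as described above, looks roughly like this. The sketch runs against an in-memory SQLite table so it is self-contained; `record_entry` is a stand-in for the real table, and `fetch_in_batches` is a hypothetical name that does not come from either exporter.

```python
import sqlite3


def fetch_in_batches(conn, batch_size):
    """Yield lists of (id, marc) rows, batch_size at a time, in id order."""
    offset = 0
    while True:
        rows = conn.execute(
            "SELECT id, marc FROM record_entry WHERE NOT deleted "
            "ORDER BY id LIMIT ? OFFSET ?",
            (batch_size, offset),
        ).fetchall()
        if not rows:
            break
        yield rows
        offset += len(rows)


# Stand-in data: seven records, one of them flagged deleted.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE record_entry (id INTEGER PRIMARY KEY, marc TEXT, deleted INTEGER)")
conn.executemany(
    "INSERT INTO record_entry VALUES (?, ?, ?)",
    [(i, f"<record id='{i}'/>", 1 if i == 4 else 0) for i in range(1, 8)],
)
batches = [[row[0] for row in batch] for batch in fetch_in_batches(conn, 3)]
```

The ORDER BY is what makes the pages stable. Memory stays bounded by the batch size, at the cost of the server skipping over earlier rows on every later page.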
13:20 Stompro Dyrcona, if you use gnu time, it gets you max memory usage also.  /usr/bin/time -v... so you don't have to check that separately.
13:25 Stompro Dyrcona, I'm surprised the execution time increased for you... hmm.
13:28 Dyrcona Things are always weird here.
16:13 berick Stompro++ eeevil++ # looks like cursors are an option -- will give it a poke
16:17 Dyrcona I'll give cursors a poke, too.
16:17 Dyrcona I think I commented about "cursors and Sybase" and my early experience with them at The Jockey Club and with Horizon last week.
16:19 Dyrcona BTW, my maximum memory usage is 9GB. I think that's my biggest issue with MARC export.
16:19 eeevil rewindable and writable cursors, and with-hold cursors, are not as fast as not-those-types in PG, but we don't need those, generally.
16:19 Stompro With --items, 1.3G vs 256M for max resident memory, 596s vs 473s run time.  (That as compares the 852 insert changes).
16:19 Stompro s/as/also/

Results for 2023-10-25

09:18 kmlussier sleary: I hope you enjoyed your first hack-a-way!
09:21 Dyrcona So, for those following the marc_export saga: I'm logging the time spent in subroutines in microsecond granularity using Time::HiRes. I hope this points to where the trouble is. I'm also only exporting 1 library's items. They have about 145,000 records so it should finish in a "reasonable" amount of time.
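The per-subroutine timing described above has a close Python analogue. This sketch accumulates microseconds per function with a decorator; it is not the Time::HiRes instrumentation from marc_export, and `timed` and `add_items` are hypothetical names.

```python
import time
from collections import defaultdict
from functools import wraps

elapsed_us = defaultdict(int)  # function name -> total microseconds
calls = defaultdict(int)       # function name -> number of calls


def timed(func):
    """Accumulate wall-clock time spent in func, in microseconds."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter_ns()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed_us[func.__name__] += (time.perf_counter_ns() - start) // 1000
            calls[func.__name__] += 1
    return wrapper


@timed
def add_items(record_id):
    # Hypothetical stand-in for the work done per record.
    return [record_id] * 3


for rid in range(100):
    add_items(rid)

for name in elapsed_us:
    print(f"{name}: {calls[name]} calls, {elapsed_us[name]} us total")
```

Totals per function, rather than a log line per call, keep the output small when the export covers well over a hundred thousand records.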
09:26 redavis joined #evergreen
09:27 Dyrcona berick: I'm going to do some more research on Rust and look into adding a debug option to the Rust marc-export.rs. I'd like to log the queries it runs, because it looks like the main query ran for 5 secs each time it was called on my production db the other day. I want to get the whole thing to run through explain analyze. It gets truncated in the postgres logs.
09:37 Dyrcona Rogan left. I wanted to ask him about some queries he wrote, i.e. why he used subqueries instead of join in some places. Is it because the subqueries were faster or was that just how it came to him? I have seen subqueries be faster than joins in some cases, but not always.
09:38 Dyrcona I suppose I could test it on my data and adjust if appropriate. It often depends on indexes and number of rows.
09:49 terranm67 joined #evergreen

Results for 2023-10-20

09:01 mantis1 This season has been terrible
09:02 mantis1 I hope you'll be ok for Hackaway!
09:06 Dyrcona joined #evergreen
09:08 Dyrcona berick: I did `cargo build --release --package evergreen` then copied eg-marc-export to /openils/bin/. I missed the password on one of the two lines for eg-marc-export in my script, so I don't know if it is faster, but the binary is certainly smaller without the debugging symbols, etc.
09:14 redavis joined #evergreen
09:18 terranm joined #evergreen
09:18 Dyrcona FWIW, I haven't used --release on my test system. I did that for the production server.
10:10 csharp_ sounds good too
10:21 Dyrcona Hmm. One of our marc_exports is still running since Tuesday night.
10:21 Dyrcona I wonder if Perl has some kind of issue on virtual machines?
10:22 Dyrcona Well, I can always replace it with eg-marc-export.
10:32 berick Dyrcona: fwiw, the --release build chopped off 1/3 of the runtime for my 150k record+items export.
10:32 berick depends on the data, i'm sure, though
10:34 * JBoyer wonders what Kuma's Korner does for birthdays?

Results for 2023-10-19

10:28 mantis1 was missing one of those thank you!
10:32 jeff recall holds are a thing that requires some additional A/T setup. I don't know if the defaults are suitable / functional out of the box. Org unit settings likely also.
10:33 jeff I was actually just wondering if it made sense to have an option to hide recall as a hold option, for those reasons and more. :-)
10:39 Dyrcona berick: My test of the Rust marc export finished in 9 hours 23 minutes, and exported a 3.4GB binary file with 1,737,349 records. There are 411 records in the error output. That looks good to me, compared to what I'm getting from Perl.
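The figures in that report reduce to a throughput and an error share. A short worked calculation using only the numbers given; treating the 411 error records as a fraction of the exported count is an assumption, since the log does not say whether they are included in the 1,737,349.

```python
elapsed_seconds = 9 * 3600 + 23 * 60   # 9 hours 23 minutes
records = 1_737_349                     # records in the exported file
errors = 411                            # records listed in the error output

records_per_second = records / elapsed_seconds
error_share = errors / records          # assumption: errors relative to exported records

print(f"{records_per_second:.2f} records/second")
print(f"{error_share:.4%} of records reported errors")
```

That works out to a little over 51 records per second, with roughly 0.02% of records flagged.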
10:46 csharp_ rs++
10:57 Dyrcona csharp_: Do you think the marc_export is slow with the --items option?
10:57 briank joined #evergreen

Results for 2023-10-16

14:30 Dyrcona berick: Speaking of Rust... I think you might have introduced a bug with a recent commit when you moved where OFFSET gets added. I got the following when using --query-file:
14:31 JBoyer Ah, still catching up here and there. Something is still bonkers with that extract though given what we're seeing here.
14:32 Dyrcona thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: Some(Error),  code: SqlState(E42601), message: "syntax error at or near \"OFFSET\"", detail:  None, hint: None, position: Some(Original(630)), where_: None, schema: None, table: None, column: None, datatype: None, constraint: None, file: Some("scan.l"), line: Some(1123), routi
14:32 Dyrcona nner_yyerror") }) }', evergreen/src/bin/marc-export.rs:398:56
14:33 Dyrcona JBoyer: --items has always been slow for us, but it's worse than ever, and it really looks like it is Perl.
14:34 berick Dyrcona: k.  mind sharing your query file?
14:35 Dyrcona berick: I'm modifying one of them. Let me try another run.
14:35 berick Dyrcona: i think i see the issue..
14:37 Dyrcona One of the queries returns 3 columns: bre.id, bre.marc, and count(acp.id). It didn't look like the 3rd column would be an issue.
14:38 Dyrcona JBoyer: I would not be surprised if there is something wrong with the Perl versions I'm using or something, but I don't feel like I have time to deal with that. I'm under pressure to get them records last week. :)
14:40 berick the chunked processing requires ordering/limiting/offsetting which adds additional constraints to the format of the query file.  for now, could just read query file as-is and avoid any paging.
14:41 JBoyer True, finding the Right Fix when you're under a Right Now deadline is a lot like being technically correct but completely unhelpful. :) I just think that *after* you can get the initial export done and transported that replacing the exporter won't necessarily be the ideal fix. (Though, for later, I'm also curious what OS the super slow exporter is running on)
14:51 jihpringle joined #evergreen
14:52 berick Dyrcona: pushed a patch to avoid modification of the --query-file sql
14:52 berick re: id, any chance you have multiple columns resolving to the name "id"?
14:53 Dyrcona berick: Cool. I might.... I'm going to modify the query that I think is blowing up to use a CTE, and then grab bre.id and bre.marc.
14:53 Dyrcona Currently, it's actually returning acn.record, bre.marc, and the count on acp.id.
14:56 Dyrcona Dude..... I just noticed the file ends with two semicolons.....I'll bet that's it.
14:56 Dyrcona Still, I think I'll do the CTE.
14:57 berick my test file contains: SELECT id, marc FROM biblio.record_entry WHERE NOT deleted  (no semicolons needed)
14:59 Dyrcona Yeah, ; is a habit from writing stuff for psql.
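The failure mode above (paging clauses appended after a query file that ends in semicolons) suggests normalizing the SQL first. A hypothetical sketch, not how eg-marc-export handles it: `page_query` strips trailing semicolons and whitespace, then wraps the query as a subquery so ORDER BY, LIMIT and OFFSET always land in a valid position. It assumes the inner query exposes exactly one column named `id`.

```python
import re


def page_query(sql, limit, offset):
    """Wrap a user-supplied SELECT so it can be paged by id.

    Trailing semicolons and whitespace are removed first; otherwise the
    appended clauses would follow a statement terminator and fail to parse.
    """
    base = re.sub(r"[;\s]+\Z", "", sql)
    return f"SELECT * FROM ({base}) AS q ORDER BY id LIMIT {int(limit)} OFFSET {int(offset)}"


query_file_text = "SELECT id, marc FROM biblio.record_entry WHERE NOT deleted;;\n"
paged = page_query(query_file_text, 1000, 2000)
print(paged)
```

Wrapping rather than appending also avoids clashing with an ORDER BY or GROUP BY already present in the query file.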
15:07 pinesol News from commits: LP2035287: Update selectors in e2e tests <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=f562b3ac30a3753d63d565c2d7be4d3a7121a2fb>
15:11 Dyrcona Now, I'm cooking with gas. I got the "large" bibs file (85 records with 87,296 items) dumped to XML in under a minute. That took almost 2 hours with the Perl program the other day.
15:18 Dyrcona Using --query-file to feed the eg-marc-export, it is using a lot more RAM than before, about the same as the Perl export was using. It still uses less CPU. We'll see if that changes over time.
15:28 jeff you have 87,296 items that are spread across only 85 bibs?
15:29 Dyrcona Yes, we do.
15:29 jeff color me intrigued.

Results for 2023-10-13

13:20 jeffdavis yes, fairly frequently
13:28 pinesol News from commits: LP#2007603: restore functioning of default search tab preference <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=adf3bd07e0e4558e9d48f9cdf5081e956eef7866>
13:55 jihpringle joined #evergreen
13:59 Dyrcona berick: eg-marc-export appears to be way more efficient than marc_export. However, the XML it produces isn't pretty printed, so I'll have to split records with sed or something to see what I've got when it is done.
14:06 berick Dyrcona: ah, i've been piping to xmllint --format
14:06 berick but could easily add a --pretty-print option
14:12 berick pushed --pretty-print-xml option.  added some other options since yesterday as well.
14:13 mantis1 Is recall and force holds a library setting?
14:18 berick mantis1: when placing an item-level hold in the staff catalog, the options should be available.
14:18 berick they do have their own permissions
14:18 Dyrcona berick: I could have just dumped a binary marc file. I've got a thing to count records for that.
14:19 Dyrcona I pasted your example from yesterday and changed the filename.
14:20 jihpringle mantis1: depending on what you're doing there are three library settings that control how recalls work
14:21 mantis1 berick: it might be the permissions then that I need to assign
14:21 mantis1 berick++
14:22 mmorgan mantis1: I think those hold options are only available under Request Item in Item Status.
14:22 berick mmorgan: they're in the Ang staff catalog
14:22 Dyrcona berick is there a way to just build eg-marc-export?
14:23 mmorgan Oh! I should have known that!
14:23 berick Dyrcona: sorta, not really.  you can build --package evergreen, but it of course builds its opensrf dependencies
14:24 Dyrcona OK. Thanks!
14:58 berick looking at some other options too
15:01 Dyrcona I should have time'd this run... I'll do that next time.
15:02 mantis1 left #evergreen
15:24 Dyrcona berick: It doesn't look like eg-marc-export can be fed a list of ids in a pipe.
15:24 Dyrcona That was more a question, really.
15:28 Dyrcona It finds lots of errors, too.
15:32 berick Dyrcona: no, that's not yet supported
15:34 Dyrcona I'm going to start over with some different options and time it.
15:34 berick k.  --query-file may need some work too.  it wants 'id' and 'marc' columns in the query, which may vary from the marc_export version.
15:35 * berick looks at --pipe
15:35 Dyrcona Yeah, but I could use a modified version of the SQL to get the list of ids, just add the marc column.
15:35 berick yeah
15:35 Dyrcona marc_export just takes ids on stdin.
15:36 Dyrcona Think I'll just dump all records with --items to a binary file, time it, and see what I get and how long it takes. I'll dump stderr to a file to see if I can fix some of these records. I see a bit of "bad subfield code."
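
The 14:18 message mentions counting the records in a binary MARC file. The log does not show how that tool works; the following is only a minimal sketch of one way to do it, assuming well-formed ISO 2709 records whose first five leader bytes hold the record length in ASCII digits. The function name `count_marc_records` is hypothetical.

```python
from typing import BinaryIO


def count_marc_records(stream: BinaryIO) -> int:
    """Count binary MARC records by walking the length stored in each leader."""
    count = 0
    while True:
        head = stream.read(5)
        if not head:
            return count  # clean end of file
        if len(head) < 5 or not head.isdigit():
            raise ValueError(f"record {count + 1}: bad length field {head!r}")
        length = int(head)
        if length < 24:
            # A record can never be shorter than its own 24-byte leader.
            raise ValueError(f"record {count + 1}: implausible length {length}")
        rest = stream.read(length - 5)
        if len(rest) != length - 5:
            raise ValueError(f"record {count + 1}: truncated record")
        count += 1
```

This trusts the leader, so a record with a miscalculated length will raise an error rather than be counted.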

Results for 2023-10-12

11:09 Dyrcona ps -o etime,pcpu,rss,pmem 110013
11:09 Dyrcona ELAPSED %CPU   RSS %MEM
11:09 Dyrcona 23:29:58 97.2 9550376 29.0
11:09 Dyrcona 360853 <- Number of records in the binary MARC file.
11:10 Dyrcona /openils/bin/marc_export --all --items <-- command line
11:12 Dyrcona I wonder what our average copy count per bib is? Probably not that high, though we do have 1 bib with over 4,000.
11:13 Dyrcona I estimate it will export about 1.78 million records.
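
The figures pasted at 11:09 (elapsed time from `ps -o etime` and the record count so far) are enough for a rough projection of the total run time. The sketch below assumes a constant export rate, which the log does not guarantee; `parse_etime` and `estimate_total_days` are hypothetical helper names.

```python
def parse_etime(etime: str) -> int:
    """Convert a ps etime value ([[dd-]hh:]mm:ss) to seconds."""
    days = 0
    if "-" in etime:
        day_part, etime = etime.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in etime.split(":")]
    while len(parts) < 3:
        parts.insert(0, 0)  # pad missing hours
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def estimate_total_days(elapsed: str, exported: int, expected_total: int) -> float:
    """Project the full run time in days, assuming a constant records-per-second rate."""
    rate = exported / parse_etime(elapsed)
    return expected_total / rate / 86400


# Figures from the 11:09 and 11:13 messages.
print(round(estimate_total_days("23:29:58", 360853, 1_780_000), 2))
```

With those inputs the projection comes out a little under five days.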

Results for 2023-10-11

11:02 Dyrcona We will also likely never use it in production....
11:05 Dyrcona It might need more patches than just that one.... I'll leave it for now.
11:14 kmlussier joined #evergreen
11:18 Dyrcona So, going back to yesterday's conversation about MARC export, I wonder if that commit really was the problem. I reverted that one and two others, then started a new export. It has been running for almost 21 hours and only exported about 340,000 records. I estimate it should export about 1.7 million.
11:20 Dyrcona At that rate, it will still take roughly 5 days to export them all. This is on a test system, but it's an old production database server and it's "configured." The hardware is no slouch. I guess I will have to dump queries and run them through EXPLAIN.
11:28 Dyrcona Y'know what. I think I'll stop this export, back out the entire feature and go again.
11:29 jeff if it's similar behavior as yesterday and most of the resource usage appears to be marc_export using CPU, I'd suspect inefficiency in the MARC record manipulation or in dealing with the relatively large amount of data in memory from the use of fetchall_ on such a large dataset.

Results for 2023-10-10

10:27 Dyrcona `ps -o etime 24586` said 4-19:00:01 just a few seconds ago.
10:28 Dyrcona I'm running it with time, but was curious how long it has been going so far.
10:29 Dyrcona Adding --items seems to really slow it down on this setup.
10:33 Dyrcona I should try it with a binary MARC file to see if that makes a difference. I wonder if writing the output locally is a problem, though I doubt it.
10:36 Dyrcona The db server does not appear to be under any strain.
10:37 Dyrcona Load is 0.08 and plenty of free RAM, which could mean it's not cached, but with NVMe, who needs cache? ;)
10:44 sandbergja joined #evergreen
10:44 Dyrcona Makes me wonder if we're missing an index, or if adding a new index might help. It would be nice if there was an easy way to dump the SQL from Perl DBI... Maybe there is. I should check.
10:48 Dyrcona I suppose I could hack a copy of marc_export to dump the SQL instead of executing it.
10:50 briank joined #evergreen
10:50 Dyrcona I'd like to run it through explain. It's probably the queries to grab item information to add to the MARC, so I'll have to dump an example of that, too.
10:52 Dyrcona Guess I will be looking into it later.... *sigh*
10:52 jeff or tweak log_min_duration_statement just long enough to capture some sample queries. depends on how otherwise loaded your db server is, if this is prod.
10:56 Dyrcona This is a test system that hosts multiple databases, but this is the only instance currently doing anything.
11:03 Dyrcona It's not running on the same server as the DB either.
11:06 Dyrcona I'll have to do some investigation to see where the problem lies. Maybe I can get some improvements for everyone out of this.
11:20 Dyrcona jeff++ # I may just crank the logging up for a test run later. I suspect this one will finish sometime later today, but I also thought that it would have been done by yesterday to start with.
11:24 Dyrcona FWIW, I'm dumping XML because it's "easier" to work with than binary MARC, but when a file is about 8GB in size, the format doesn't really matter any longer, does it? :)
11:25 collum joined #evergreen
11:33 kmlussier joined #evergreen
11:39 jihpringle joined #evergreen
13:41 Dyrcona jeff: I think some of the patches that I am testing are responsible for the slow down, particularly the one for the above Lp bug.
13:45 Dyrcona I think I'll revert a couple of commits before I say much more.
14:21 Dyrcona Hmm... Looks like I have somewhere in the vicinity of 400,000 records left to export. I think I'll stop this one and try again with the suspected commits reverted.
14:25 Dyrcona Think I'll export to a binary MARC file this time. At least the file will be smaller.
14:43 mdriscoll joined #evergreen
14:50 shulabear joined #evergreen
14:50 Stompro joined #evergreen

Results for 2023-09-29

12:25 kmlussier Dhruv_Fumtiwala: No, they are stored in action.circulation
12:26 kmlussier The aged_circulation table is where those transactions go if you set up the process to remove patron information from them.
12:34 collum joined #evergreen
13:32 Dyrcona Nothing like doing a marc_export to find bad bib records: Warning from bibliographic record 313383: Argument "I65" isn't numeric in integer division (/) at /usr/share/perl5/MARC/Record.pm line 407.
13:41 Dyrcona Hmm. Maybe I should not have run this test with our largest member library.... It's taking a while to produce the output. :)
13:42 sleary joined #evergreen
14:06 Dhruv_Fumtiwala joined #evergreen

Results for 2023-09-19

12:53 mantis1 joined #evergreen
13:35 abneiman sandbergja++ # adoc assistance
13:35 abneiman Bmagic: I pushed a followup, fingers crossed for a clean build tonight!
13:49 Dyrcona JBoyer: Following up on a private conversation from last week: Do you know if EOLI has any marc_export patches dealing with record size? If not, I'm curious how you handle the "export oversize MARC records in XML" for Aspen.
13:58 pinesol News from commits: Docs: followup commits to Reports docs <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=e7b4f2d7d4479f1f409a4baed68a5edc9541f9ae>
14:51 * Dyrcona applies all the patches to marc_export.
15:30 mantis1 left #evergreen

Results for 2023-09-01

12:02 Dyrcona jeff: S'allright.
12:02 Dyrcona Sometimes the answer is just upgrade. :)
12:03 Dyrcona jeff: That's interesting what you point out. I would have expected an error that bar has no column 'id.'
12:08 Dyrcona I suppose I could try and figure out what data from MARC is being used to build the wide_display_entry title and physical_description fields, or I could just take her word that she wants the 300$n, and use 245$a for the title. :)
12:09 Dyrcona My suspicion is this query will be faster if it drops the join on metabib.wide_display_entry and just grabs the data from MARC via XPath, since biblio.record_entry is already joined.
12:09 Dyrcona The original is still running against my test database.
12:10 jeff more detailed example that I just created: https://www.db-fiddle.com/f/9YseNbGnFVuqPkVJK85aew/0
12:16 jeff fun when "foo" is something like actor.usr or biblio.record_entry and "bar" is records_to_update or records_to_delete or the like. good reason to qualify your column references even when not forced to by ERROR:  column reference "id" is ambiguous

Results for 2023-03-31

12:58 jvwoolf Dyrcona: I updated biblio.record_entry in order to generate the new fingerprints. Do I have to do it again for the metarecords?
12:59 jvwoolf I think you, JBoyer and I had this conversation before, and what I came away with seemed to work in my first test, but not the most recent few
13:03 Dyrcona jvwoolf: There's a flag... let me look it up.
13:04 Dyrcona ingest.reingest.force_on_same_marc <- Needs to be true in config.internal_flag if the MARC didn't change, I think.
13:13 jvwoolf Dyrcona: This is a global flag?
13:14 jvwoolf I don't seem to have that in the 3.9.2 database
13:16 Dyrcona jvwoolf: It should show up in config.global_flag and config.internal_flag.

Results for 2023-03-15

09:50 kworstell-isl joined #evergreen
10:12 Christineb joined #evergreen
12:07 jihpringle joined #evergreen
12:10 Dyrcona Binary MARC records with HTML entities in them.... I guess.... Whatever.....
12:27 berick @decide binary-marc-with-html OR html-with-binary-marc
12:27 pinesol berick: That's a tough one...
12:57 jeff &2DzfVQ- is the IMAP4 modified UTF-7 (mUTF-7) encoding for U+1F355, aka the "pizza" emoji: 🍕
12:59 * jeffdavis backs away slowly
13:02 Dyrcona Does that pizza emoji have pineapple on it?
13:04 Dyrcona So, chardet3 is my new friend. It tells me UTF-8 encoded MARC files are UTF-8 with 0.99 confidence, and MARC-8 encoded MARC files are ISO-8859-1 with 0.70 to 0.75 confidence.
13:04 Dyrcona chardet3 does not know about MARC-8.
13:05 * Dyrcona looks for a similar module to Python's chardet in Perl.
13:07 Dyrcona libencode-detect-perl is packaged for Ubuntu/Debian.
13:12 Dyrcona It turns out that libraries can choose UTF-8 when downloading records from Overdrive. If they don't then the records are apparently MARC-8. I don't want to have to figure that out manually, so I'm going to make my record load program do that for me.
13:15 Dyrcona And Encode::Detect won't do what I want, since it decodes the text using the detected encoding. That will break MARC-8.
13:15 Dyrcona I feel like I've had this monologue before....
13:26 Dyrcona Aha! ascii if there are no "fancy" characters for MARC-8.
13:35 rfrasur joined #evergreen
14:26 Dyrcona 1 file changed, 39 insertions(+), 7 deletions(-) # Hopefully that does it!
14:43 scottangel I'm looking at the 'strict barcodes' checkbox on the patron checkout page. Looks like it's bound to a variable called $scope.strict_barcode. The problem I'm facing is the function ng-change="onStrictBarcodeChange()" doesn't flip the boolean. Am I missing something? shouldn't there be something like $scope.strict_barcode = !$scope.strict_barcode; From what I can tell is this setting is meant to be saved w/ egCore.hatch.setItem() but
14:52 Dyrcona Good question. I don't know.
15:13 Dyrcona Meh... I should spell check commit messages before pushing....
16:01 BDorsey joined #evergreen
16:37 Dyrcona Whee! MARC-8 encoded record, says it's UTF-8 in the leader but \xE1\x65 is MARC-8 for \xC3\xA8 in UTF-8.
16:40 Dyrcona NB: I haven't done anything to the file other than inspect with my editor.
16:40 Dyrcona Grr..... Omitted a word there.
16:45 Dyrcona Also....Editing text in browser text boxes stinks....
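
The 16:37 message describes a record whose leader claims UTF-8 while the data is actually MARC-8. A minimal sketch of a check for that mismatch follows, assuming the MARC 21 convention that leader position 09 is `a` for UCS/Unicode and blank for MARC-8. The function names are hypothetical, and this is not the change referred to at 14:26.

```python
def leader_claims_utf8(record: bytes) -> bool:
    """True if leader/09 (character coding scheme) is 'a', meaning UCS/Unicode."""
    return len(record) >= 24 and record[9:10] == b"a"


def is_valid_utf8(record: bytes) -> bool:
    """True if the raw record bytes decode cleanly as UTF-8."""
    try:
        record.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def encoding_mismatch(record: bytes) -> bool:
    """Flag records that say UTF-8 in the leader but do not decode as UTF-8."""
    return leader_claims_utf8(record) and not is_valid_utf8(record)
```

As noted at 13:26, a MARC-8 record with no "fancy" characters is plain ASCII and therefore also valid UTF-8, so this check only catches records that actually contain non-ASCII MARC-8 bytes such as `\xE1\x65`.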

Results for 2023-03-13

09:46 dguarrac joined #evergreen
10:25 Christineb joined #evergreen
10:54 Dyrcona Hm.. I wonder how workable it would be to replace ISO-8859-1 copyright symbols with UTF-8 ones using a regex.... It seems like it would be simple, but it's the kind of thing that can lead to problems. I have a file of records that I can play with....
10:58 Dyrcona Looks like it happens with registered trademark symbol, too. (Gotta love vendor-supplied MARC records.)
11:06 Bmagic Love em indeed
11:16 Dyrcona Bmagic: Does your load process handle things like that? I have a --strict option on one of my load programs that rejects records with bad characters, well any warnings are treated as errors, really.
11:16 Dyrcona I'm considering adding code to fix copyright and registered trademark symbols since they seem to be a thing with this one vendor in particular.
11:21 Bmagic Dyrcona: but FWIW: https://github.com/mcoia/sierra_marc_tools/blob/master/auto_rec_load/dataHandler.pm around line 800, if it dies, it will failover to readMARCFileRaw
11:22 Dyrcona Bmagic: Thanks. I've had a glance at that code before. My issue isn't reading the records. We get these warnings when loading them: utf8 "\xA9" does not map to Unicode at /usr/lib/x86_64-linux-gnu/perl/5.26/Encode.pm line 212, <GEN1> chunk 300.
11:23 Dyrcona They will load in the database, if I let them go in.
11:24 Dyrcona The warnings don't occur while preprocessing the records using MARC::Record to modify the 856 tags.
11:24 Bmagic writing something to transcode one character to another seems doable. But I've not done it. Sorry :(
11:25 Dyrcona Yeah, it's actually transcoding to 2 characters \xA9 -> \xC2\xA9.
11:25 Bmagic it sounds like you'll end up having to read the file character by character instead of letting MARC::Record do it?
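
The transcoding discussed at 11:24 and 11:25 (a lone ISO-8859-1 byte such as `\xA9` becoming the two-byte UTF-8 sequence `\xC2\xA9`) can be sketched without a character-by-character loop. This is only an illustration, assuming the data is otherwise valid UTF-8 and that every invalid byte really is ISO-8859-1; the handler name `latin1_fallback` and the function `repair_stray_latin1` are hypothetical.

```python
import codecs


def _latin1_fallback(err):
    """Decode error handler: reinterpret undecodable bytes as ISO-8859-1."""
    if isinstance(err, UnicodeDecodeError):
        bad = err.object[err.start:err.end]
        return bad.decode("latin-1"), err.end
    raise err


codecs.register_error("latin1_fallback", _latin1_fallback)


def repair_stray_latin1(data: bytes) -> bytes:
    """Return UTF-8 bytes with stray ISO-8859-1 bytes transcoded to UTF-8."""
    return data.decode("utf-8", errors="latin1_fallback").encode("utf-8")
```

The repair changes the byte length, so it would need to be applied to field data before the record is serialized, not to a raw binary record whose leader and directory already store lengths.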

Results for 2023-02-17

12:10 jihpringle joined #evergreen
12:24 collum joined #evergreen
12:27 Bmagic Dyrcona++
15:08 Dyrcona Does anyone update MARC directly in the database, maybe using regex replace or something? I usually write a Perl program using MARC::Record. I'm thinking of trying one using a regex substring replace directly in the database.
15:08 Dyrcona Three o'clock on a Friday is probably the wrong time to ask.
15:24 Dyrcona The proposed change will make the records shorter by a few bytes. I should use a Perl program with MARC::Record.
15:30 mmorgan Dyrcona: I've often wondered if that would be a good idea for certain projects, though I can't think of specific ones at three o'clock on a Friday (before a long weekend) ;-)
15:32 Dyrcona mmorgan: I recall receiving the suggestion to add a database Perl function, but I'm not fond of that for one shot updates.
17:05 mmorgan left #evergreen

Results for 2023-01-24

08:28 mantis1 joined #evergreen
08:34 mmorgan joined #evergreen
09:14 Dyrcona joined #evergreen
09:18 Dyrcona So, going back to my UPC problem from yesterday... If the goal was to make something work for all standard numbers, this would have been a better xpath expression: /marc:datafield[@tag='024']/marc:subfield[@code='a' or @code='z']
09:19 Dyrcona Doing the 'or' on all of the documented indicators was kinda dumb, and introduced the bug.
09:36 Dyrcona So, I don't think this bug was noticed until display fields started being used in the OPAC/staff client.
10:01 Dyrcona So, I think what I'm trying to do now could benefit from indexes being added to the field column on metabib.{keyword,identifier}_field_entry tables. Think I'll do that and drop the indexes when I'm done.

Results for 2023-01-23

16:02 Dyrcona I wonder if people know what they're talking about? The example record that I'm given was last updated in 2021 according to the dev database.
16:05 mmorgan Dyrcona: What is your xpath for config.metabib_field.id = 20?
16:06 Dyrcona I think I pasted it earlier.
16:07 Dyrcona Guess I didn't. /marc:datafield[@tag='024' and @ind1='1' or @ind1='2' or @ind1='3' or @ind1='4' or @ind1='5' or @ind1='6' or @ind1='7' or @ind1='8']/marc:subfield[@code='a' or @code='z']
16:09 jeff that looks... non-stock.
16:10 jeff and i worry a bit (without refreshing my xpath syntax) about the AND OR OR... without parens.
16:10 Dyrcona It is non-stock, but it is correct.
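
The concern raised at 16:10 about AND and OR without parentheses comes down to operator precedence: in XPath 1.0, `and` binds more tightly than `or`. Python's boolean operators follow the same rule, so the effect can be sketched as a Python analogue. This does not evaluate XPath; the function names are hypothetical stand-ins for the predicate.

```python
INDICATORS = ("2", "3", "4", "5", "6", "7", "8")


def unparenthesized(tag: str, ind1: str) -> bool:
    """Analogue of: @tag='024' and @ind1='1' or @ind1='2' or ... or @ind1='8'."""
    # Parsed as (tag == "024" and ind1 == "1") or ind1 == "2" or ...
    return tag == "024" and ind1 == "1" or any(ind1 == i for i in INDICATORS)


def parenthesized(tag: str, ind1: str) -> bool:
    """Analogue of: @tag='024' and (@ind1='1' or @ind1='2' or ... or @ind1='8')."""
    return tag == "024" and (ind1 == "1" or any(ind1 == i for i in INDICATORS))


# A 245 field with first indicator 2 matches only the unparenthesized form.
print(unparenthesized("245", "2"), parenthesized("245", "2"))
```

In the unparenthesized form the tag test applies only to the first indicator value, so any datafield with a first indicator of 2 through 8 matches regardless of its tag.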

Results for 2023-01-19

13:02 miker jeff: re LP 1829295, without wading into the bug itself, a big +1 to a YAOUS for respecting closed dates (and, you can just delete the row from config.org_setting_type, probably through the UI as the admin user!)
13:02 pinesol Launchpad bug 1829295 in Evergreen "Shelf expire date doesn't respect closed dates" [Wishlist,Confirmed] https://launchpad.net/bugs/1829295
13:15 jeff miker: thanks for the feedback! i may have only half-followed you, though. which org unit setting type are you referring to?
13:55 Dyrcona "Smart" quotes in MARC.....That seems to be what's causing problems with record sizes.
13:55 mantis1 joined #evergreen
13:56 rhamby smart_quotes--
13:56 sleary ugh
14:03 Dyrcona sleary++
14:03 Dyrcona I'm leaning towards Windows and "copy and paste" cataloging. I was just converting from octal to see what the value is to look it up in UTF-8.
14:05 sleary Quotes and apostrophes copied from Word in Windows used to truncate content in WordPress constantly. Good times.
14:06 Dyrcona They truncate MARC records, too, because one of the characters in the sequence is the MARC End of Record character. I specifically use code to look for End of Field followed by End of Record to avoid this.
14:07 Dyrcona Looks like our Perl MARC code doesn't calculate a proper record length, but those characters shouldn't be in a MARC record in the first place.
14:07 rhamby utf16--
14:08 Dyrcona Heh. If only everything was big endian UTF-32....drive manufacturers would be happy.... :)
14:16 mmorgan @quote get 232
14:16 pinesol mmorgan: Quote #232: "<mmorgan> Smart quotes are kinda like smart TVs in that neither are all that smart" (added by Dyrcona at 04:47 PM, October 19, 2022)
14:17 Dyrcona mmorgan++
14:18 Dyrcona There's a bug in the CPAN RT for MARC::Batch or MARC::Record (maybe on Github, too), and I don't think the solution actually works.
14:23 Dyrcona I thought tsbere had a proposed solution for this one: https://rt.cpan.org/Public/Bug/Display.html?id=70169
14:28 jihpringle joined #evergreen
14:32 Dyrcona I have been told that this can happen if people copy and paste from Amazon when cataloging.
15:19 Dyrcona Yeahp. GNU Emacs also says the file of bad records is UTF-16 when I open it.
15:20 * Dyrcona wonders if pinesol has any dry kona in the coffee database.
15:24 sleary joined #evergreen
15:26 Dyrcona It seems odd to me that a program using MARC::Record->new_from_usmarc can read these records, modify them, and write them out without issue, but another program using the same Perl module blows up. I suspect the writing out leads to a bad length in the LDR.
15:32 Dyrcona oh!
15:33 Dyrcona The original records display just fine in GNU Emacs..... They get mangled going through Perl.
15:34 Dyrcona I see the curly apostrophes and quotes, and Emacs says the coding system is multi-byte UTF-8. Something has a double encoding problem with these characters.
15:35 Dyrcona I wonder if I'll even be able to fix this in a reasonable manner?
15:36 Dyrcona Perl's Unicode support is so broken....
15:42 Dyrcona See... this is what I dislike about Unicode in Perl (particularly with MARC): one time I encode/decode the records and it works. Next time, the records come out garbled. Mebbe I should reread the Unicode FAQ and double check the MARC code to know what's really going on here.
16:33 pinesol News from commits: Docs: global flags docs fixes <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=3f7b48566d3f34e07c5b7ba5b27ed23d97abd4b4>
16:44 jvwoolf left #evergreen
17:00 mmorgan left #evergreen
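
The 14:06 message describes looking for End of Field followed by End of Record so that a stray End of Record byte inside the data does not truncate a record. The actual code is not shown in the log; the following is a minimal sketch of that idea, using the ISO 2709 field terminator `0x1E` and record terminator `0x1D`. The function name is hypothetical.

```python
FIELD_TERMINATOR = b"\x1e"
RECORD_TERMINATOR = b"\x1d"


def split_on_field_and_record_end(data: bytes) -> list[bytes]:
    """Split a binary MARC stream only where 0x1E is immediately followed by 0x1D."""
    marker = FIELD_TERMINATOR + RECORD_TERMINATOR
    records = []
    start = 0
    while True:
        idx = data.find(marker, start)
        if idx == -1:
            # Any trailing bytes after the last terminator pair are ignored.
            return records
        end = idx + len(marker)
        records.append(data[start:end])
        start = end
```

One way a stray `0x1D` can appear: the right curly quote U+201D is the byte pair `0x20 0x1D` in big-endian UTF-16, which would fit the UTF-16 observation at 15:19. Splitting on `0x1D` alone would cut such a record in two.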

Results for 2022-12-09

14:52 Bmagic mschell: you might be able to get away with a little regular expression in the OPAC to tease out the pieces that you need
15:02 Dyrcona miker | csharp_ : bug 1999274   I didn't link miker's branch because I thought you might like to add the bug # to the commit message.
15:02 pinesol Launchpad bug 1999274 in Evergreen "Performance of Search on PostgreSQL Versions 12+" [Medium,Confirmed] https://launchpad.net/bugs/1999274
15:08 Dyrcona mschell: You could use the 245 $c if you prefer, just alter the XPath expression. I'd like to add that some think that using MARC in the OPAC is a mistake and a future version of the OPAC may switch to using display fields.
15:08 sleary joined #evergreen
15:16 mschell Bmagic, thanks I'll try that.
15:17 mschell Dyrcona, that is a future version I look forward to :)

Results for 2022-11-28

08:39 mmorgan joined #evergreen
09:06 Dyrcona joined #evergreen
09:32 Dyrcona Is there an Angular MARCEdit component or does the client still use the AngularJS interface?
09:33 Dyrcona Ha! There is literally "marc-edit.component.{html,ts}" in eg2/src/app/cat/authority.
09:37 Dyrcona What I'm really looking for is in Open-ILS/src/eg2/src/app/staff/share/marc-edit/marc-edit.module.ts
09:39 mmorgan Dyrcona: There's a MARC Edit option in the angular staff catalog from the full record.
09:49 Dyrcona mmorgan: Thanks! I found the files that I was looking for.
09:51 Dyrcona CWMARS has a customization to make "Local System" the default source for new bib records. I'm looking at that for the Angular editor. I'm considering adding YAOUS and putting this on Lp, but first, I'm trying to determine if it would be necessary.

Results for 2022-10-28

11:47 berick updating ticket now
11:47 gmcharlt ah, OK.
11:56 Dyrcona miker: The duplicates may have come from the orphan ingest catching up with deletes, and I loaded a file that was meant to replace existing records, and yes, there very well could be duplicates in the input file. My program won't insert duplicate records or duplicate URIs. It would update the previously inserted record, however.
12:02 Dyrcona By "deletes" I actually mean updates. I've got a script to remove located URIs from the MARC. That gets the MARC into a MARC::Record, removes the matching 856, then updates the bre in the database.
12:02 Dyrcona Probably got caught up with an incoming record. This probably would not happen under normal circumstances.
12:13 Dyrcona miker: Yeah, that's what happened. I've confirmed that all of those records had a URI removed and then later a new one was added.
12:18 dmoore Howdy all, I'm new to Evergreen and will be hanging around for a bit as I learn it. Coming from an Alma/Primo setup, so I'm glad to be part of an open source community now

Results for 2022-10-24

13:12 miker Dyrcona: boo, lame. did anything show up in the PG logs? (and, I assume, this run includes the recent updates)
13:18 Dyrcona miker: I'm not finding anything.
13:52 Bmagic has anyone had issues with report templates not working post-3.9 upgrade? But only certain ones. Seems to have to do with shelving locations. Same for item templates?
13:54 Dyrcona I swear there was a bug for MARC::Record not calculating record lengths correctly with multibyte sequences. I also swear that was fixed, but I think I'm seeing it. I can't find a bug, however.
14:01 jihpringle Bmagic: we upgraded to 3.9 in August and we haven't had any reports of template issues
14:02 Bmagic jihpringle++
14:02 jihpringle I think the only existing templates that we re-did for the upgrade were the ones related to patron notes, alerts, blocks
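
On the record-length question raised at 13:54: MARC leader and directory lengths are counted in bytes, not characters, so any length computed from a character count comes up short once multibyte UTF-8 sequences are present. The sketch below only illustrates that difference; it makes no claim about what MARC::Record actually does. `char_and_byte_length` is a hypothetical name.

```python
def char_and_byte_length(value: str) -> tuple[int, int]:
    """Return (character count, UTF-8 byte count) for a field value."""
    return len(value), len(value.encode("utf-8"))


# Curly quotes are 3 bytes each in UTF-8 but count as 1 character each.
chars, octets = char_and_byte_length("\u201cSmart\u201d")
print(chars, octets)
```

For the quoted word above, the character count is 7 and the byte count is 11, a difference of two bytes per curly quote.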

Results for 2022-09-27

11:18 BAMkubasa thanks!
11:19 berick hm, our main metadata schema is MARC.  MODS is used in some places for extracting specific bits of information.
11:20 BAMkubasa ok
11:20 * Dyrcona assumed MARC is one of the main data schemes, not metadata, but suppose we could argue about that.
11:22 Dyrcona But, we do use MODS for indexing and a good bit of display in the OPAC. It would be easier to add a new schema if you can represent it with XSLT.
11:22 BAMkubasa so, xpath is a tool/language used to interrogate xml (or structured data?), and the xml would be the thing that would have the schema if I'm remembering how these things interact correctly?
11:24 Dyrcona So, we mostly use XSLT to convert from MARC to MODS. Some index and display fields forego the use of the MODS transforms and have a XPATH expression to extract the needed data from the MARC.
11:25 Dyrcona It's easy to add new fields with XPATH. For instance if there is some RDA field you want to extract from MARC.
11:25 Bmagic BAMkubasa: I end up referencing config.xml_transform which is a straight copy from LoC (right?)
11:26 Dyrcona I guess a more appropriate question is what is the student trying to do? Add Bibframe or something like that?
11:26 BAMkubasa Don't know, they asked their local librarian, who asked me
11:26 Dyrcona Bmagic: Almost a straight copy. We've modified one or two of the transforms in the past.
11:27 Bmagic I've resorted to editing those templates too. I'm not sure if I have a copy of an Evergreen database with tweaks anymore though
11:27 Dyrcona Adding a new schema for bib records would be a big deal, i.e. a lot of work. If you have some other format that you could convert from MARC via XSLT, that would be easier.
11:27 Bmagic I think it was to include more tags for the keyword index (before the feature was added to Evergreen, making that easier)
11:28 Dyrcona Bmagic: By "we," I meant the Evergreen community/developers. Not all of the transforms are strictly stock from LoC. I also think we sometimes fall behind LoC changes to the canonical set.
11:30 Bmagic right on

Results for 2022-09-01

09:10 Stompro joined #evergreen
09:34 Stompro joined #evergreen
09:57 Dyrcona "Curiouser and curiouser," said Alice.
09:58 Dyrcona So, I'm back to banging my head against MARC record encoding because when I process records from Overdrive and encode the output as UTF-8, many of the records get double encoded. If I do the same with records from Kanopy, they don't.
09:59 Dyrcona In fact, if I write the preprocessed Kanopy records out as "bytes" instead of "utf8," that's when some of them blow up....
09:59 Dyrcona They're both sending UTF-8 for the most part as far as I can (care) to tell.
10:04 Dyrcona chardet says: Kanopy_MARC_Records__additions__joneslibrary.mrc: utf-8 with confidence 0.99
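
The double encoding described at 09:58 typically happens when bytes that are already UTF-8 are treated as ISO-8859-1 characters and encoded to UTF-8 a second time. The sketch below reproduces that effect and reverses it under exactly that assumption; it is not a diagnosis of the Overdrive or Kanopy files. The function names are hypothetical.

```python
def double_encode(text: str) -> bytes:
    """Reproduce the bug: UTF-8 bytes misread as ISO-8859-1, then encoded again."""
    return text.encode("utf-8").decode("latin-1").encode("utf-8")


def undo_double_encoding(data: bytes) -> bytes:
    """Reverse one round of double encoding; return the input unchanged if it does not apply."""
    try:
        candidate = data.decode("utf-8").encode("latin-1")
        candidate.decode("utf-8")  # the result must itself be valid UTF-8
    except UnicodeError:
        return data
    return candidate
```

Correctly encoded text is left alone because its decoded characters either fall outside ISO-8859-1 or do not form valid UTF-8 when re-encoded as single bytes. Pure ASCII passes through unchanged either way.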
