10:51 |
berick |
Bmagic: https://demo.evergreencatalog.com/osrf-gateway-v1?service=open-ils.search&method=open-ils.search.biblio.marc.staff&param={%22org_unit%22:1,%22limit%22:10,%22offset%22:0,%22searches%22:[{%22term%22:%22ocolc%22,%22restrict%22:[{%22subfield%22:%22a%22,%22tag%22:%22035%22}]}]} |
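For anyone scripting that search, the gateway query string can be built instead of hand-escaped. A minimal Python sketch, assuming exactly the host, service, method, and parameters visible in the URL above (`gateway_url` is my helper name, not an Evergreen API):

```python
import json
from urllib.parse import urlencode, parse_qs, urlparse

def gateway_url(base, service, method, *params):
    # The OpenSRF HTTP gateway takes repeated "param" arguments,
    # each one a JSON-encoded value.
    query = [("service", service), ("method", method)]
    query += [("param", json.dumps(p, separators=(",", ":"))) for p in params]
    return base + "?" + urlencode(query)

search = {
    "org_unit": 1, "limit": 10, "offset": 0,
    "searches": [{"term": "ocolc",
                  "restrict": [{"subfield": "a", "tag": "035"}]}],
}
url = gateway_url("https://demo.evergreencatalog.com/osrf-gateway-v1",
                  "open-ils.search",
                  "open-ils.search.biblio.marc.staff",
                  search)
```

Removing ".staff" from the method name switches to the public-scoped search.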
10:51 |
berick |
well, remove ".staff" as needed |
10:52 |
Bmagic |
ty! I'll work with that and see if I can get it to go my way |
11:05 |
Dyrcona |
Bmagic: Are you looking for MARC-style data for a record? |
11:05 |
Bmagic |
xml is fine |
11:06 |
Bmagic |
this API returns JSON with the bib IDs; a separate call is then required to get the details of each bib |
11:08 |
Bmagic |
a wrinkle in my conundrum is that the bibs I'm trying to turn up are electronic bibs (no items). It seems that the public (non-staff) API call won't turn those up no matter which scope I use. |
11:10 |
Bmagic |
there is a setting for marc tag search in the OPAC? |
11:10 |
Dyrcona |
Well, there are settings for whether or not, and how, records without items (most electronic records) show up in the OPAC. |
11:10 |
Bmagic |
What I'm finding is that Evergreen won't give me results for electronic scoped bibs when searching marc tags in the OPAC |
11:11 |
Dyrcona |
There might be a bug. I don't use MARC tag search. |
11:11 |
Dyrcona |
There are settings to control if scoped records show up or not depending on the scope. |
11:11 |
Dyrcona |
It might be those settings. |
11:12 |
* Dyrcona |
is doing like 3 things at once right now, so if I missed that you've scoped the search, then I apologize. :) |
15:09 |
Dyrcona |
Pg 15 mostly. |
15:09 |
Dyrcona |
Are you having issues? |
15:10 |
Bmagic |
oh good. I have done the same recently. It worked (pg 15 for me too). Though, there were many records it threw some console errors about. It still resulted in a marc file. And that file contained errors according to MARCEdit, which it happily stripped out for me using the validator tool. |
15:11 |
Dyrcona |
What console errors? I'm not sure that has to do with the Pg version so much. It's more likely down to character set issues in the MARC. |
15:11 |
Bmagic |
the DB is "C" and UTF-8, same as it was on PG 10. I'm troubleshooting an export for a VuFind instance. VuFind doesn't like the export (all of a sudden). One change we made was upgrading from PG 10 to 15, so I'm just trying to rule that out as a possible issue. I think it's just plain bad records that were introduced recently, and the PG version is a red herring |
15:12 |
Bmagic |
yep, character set issues. Which, we're no strangers to. But the underlying DB version could play a role. |
15:13 |
Dyrcona |
Did you upgrade Ubuntu, too? There was an Ubuntu upgrade that required reindexing the database or something because the Unicode library version changed. |
15:30 |
Dyrcona |
I might as well start one of them now. |
15:31 |
Dyrcona |
I should also make sure that they're using the same marc_export. |
15:32 |
|
jvwoolf joined #evergreen |
15:35 |
Dyrcona |
Bmagic: you are dumping binary MARC with encoding UTF-8? |
15:36 |
Dyrcona |
I've got one that, for some reason, dumps XML then uses yaz-marcdump to convert it to binary MARC. |
15:37 |
Dyrcona |
XML would be easier to compare. |
15:37 |
Dyrcona |
...Even if the file is bloated. |
15:38 |
* Dyrcona |
started a binary dump already and decides to let it go. |
15:30 |
Dyrcona |
"They say these days are made of rust..." Or should that be Rust? Eh, berick? |
15:33 |
Stompro |
I'm looking at how Aspen categorizes things as fiction/non-fiction... and it categorizes poetry as Fiction by default? Which seems wrong for how Libraries usually categorize things. |
15:33 |
Dyrcona |
Command line programming with PHP: I don't recommend it unless you need to for some crazy reason... like, oh, testing the Aspen Evergreen driver without installing Aspen. |
15:35 |
Dyrcona |
Stompro: Most of that comes from the MARC, I wager. Not sure if poetry can also say "fiction" in the coded values/wherever that lives, but I've seen lots of crazy stuff in MARC records over the years...decades. |
15:36 |
Dyrcona |
Come to think of it, I don't recommend web programming with PHP, either. |
15:36 |
Dyrcona |
:) |
15:37 |
Dyrcona |
Poetry should probably be its own category though. |
15:57 |
kmlussier |
I'm also not surprised jweston has published a book in every Dewey classification. She's very talented. |
16:03 |
sleary |
I have THOUGHTS about the fact that MARC has a field for festschrift but not fiction/nonfiction. |
16:03 |
* berick |
had to google |
16:04 |
Dyrcona |
I have many thoughts about MARC....most of them... not good. |
16:04 |
JBoyer |
I lack the energy to act on it but Dyrcona and berick's conversation above has me thinking about a Malware Radio parody of Wall of Voodoo's Mexican Radio, which I enjoyed quite a bit back in the day. |
16:04 |
berick |
yeah... |
16:04 |
Dyrcona |
"I'm a Mexican woa-oah radio." |
10:36 |
Dyrcona |
Actually, it's probably simple rec and some of the other triggers, too. I'm not sure I want to disable simple rec in a transaction. I'll have to investigate the triggers a bit more. |
10:40 |
pinesol |
News from commits: LP1850473 (follow-up): Update DOM selector in nightwatch test <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=bddc372d27e4ac9298ef6d265534c3b14529b2a2> |
10:41 |
Dyrcona |
No, I don't want to disable any triggers. It would lead to locking and possible dead locks. Setting session replication role is overkill since it disables all triggers. |
10:45 |
Dyrcona |
Looks like the ingest triggers are getting fired, too, but they.... Oh wait. The marc changes.... |
10:53 |
Dyrcona |
Trying different batch sizes, with a limit on a subquery: it did 10 in 5 seconds, 100 in 26 seconds, and 1000 took 4 minutes 23 seconds and used a lot more memory. |
10:53 |
* Dyrcona |
tries 500. |
10:56 |
Dyrcona |
1m40.676s... |
14:01 |
csharp_ |
reingest began ~11:30 a.m. yesterday - a little over half done per the batch count |
14:07 |
Dyrcona |
csharp_: My case was nearly catastrophic. Cascading deadlocks stemming from a process trying to delete a staff account with thousands of owned records. It was an acquisitions account. The person doing the delete tried it at least 3 more times after the first timed out in the client. |
14:10 |
Dyrcona |
This sort of thing used to happen so much in Horizon with Sybase, that I made a little GUI in Java to show me the deadlocked processes. From that I could identify the most likely culprit process. I could then click its row in the table, type Ctrl-k and that would send the Sybase equivalent of pg_cancel_backend. Every now and then, I consider adapting that to PostgreSQL and Evergreen. |
14:13 |
Dyrcona |
Looks like someone was also loading MARC order records at the same time. I may have clobbered their backend process, not sure. I tried to only cancel the delete_usr queries. |
14:23 |
Dyrcona |
Speaking of long-running processes, my update of tcn_source on 530,105 bibs where it is '' has been running for 1 day, 5 hours, and 27 minutes at this point. |
14:25 |
Dyrcona |
Fortunately, that is on a test system, or I'd have probably had to cancel it for deadlocks. |
14:29 |
|
smayo joined #evergreen |
09:27 |
Bmagic |
kmlussier++ |
09:44 |
Dyrcona |
So, I think I've found a "solution" to my MARC4J streams problem. I'm going to have two programs. The export program will write out the marcxml from Evergreen to a file and also write a file to map the database id with the record's position in the marcxml file. |
09:44 |
Dyrcona |
The second program will read both files, parsing the marcxml with MARC4J and using the other file as a key to map the current record with the database id. |
09:45 |
Dyrcona |
The first program doesn't have to be written in Java, and I suppose I could output a binary MARC file to save some space. MARCXML just seemed simpler, since I could just print the marc field directly. |
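The two-file plan described above (record data in one file, a database-id-to-position map in the other) can be sketched like this. A toy Python version with invented record content, one record per line, and file names of my choosing:

```python
import os, tempfile

# Fake stand-ins for bib IDs and their MARCXML.
records = {101: "<record>first</record>",
           102: "<record>second</record>"}

tmpdir = tempfile.mkdtemp()
data_path = os.path.join(tmpdir, "records.xml")
map_path = os.path.join(tmpdir, "records.map")

# Program 1: write each record, noting the byte offset where it starts.
with open(data_path, "wb") as data, open(map_path, "w") as idx:
    for db_id, xml in records.items():
        idx.write("%d\t%d\n" % (db_id, data.tell()))
        data.write(xml.encode("utf-8") + b"\n")

# Program 2: the map file ties each record back to its database id
# (and allows seeking straight to any one record, as here).
def read_record(db_id):
    with open(map_path) as idx:
        offsets = {int(i): int(o) for i, o in
                   (line.split("\t") for line in idx)}
    with open(data_path, "rb") as data:
        data.seek(offsets[db_id])
        return data.readline().decode("utf-8").rstrip("\n")
```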
10:03 |
sleary |
kmlussier does pinesol know about matcha frapps? |
10:03 |
Dyrcona |
my $file_content = do{local(@ARGV,$/)=$filename;<>}; # That's a gnarly trick. |
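The trick works because localizing @ARGV to the file name and clearing $/ (the input record separator) makes the diamond operator slurp the entire file in one read. The Python equivalent, for comparison:

```python
import os, tempfile
from pathlib import Path

# Make a small file so the slurp has something to read.
fd, filename = tempfile.mkstemp()
os.write(fd, b"line one\nline two\n")
os.close(fd)

# Whole file as one string, like the localized-@ARGV/$/ trick in Perl.
file_content = Path(filename).read_text()
```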
10:05 |
kmlussier |
sleary: It certainly can be added as a dessert. :) |
15:52 |
Dyrcona |
Hmm... The permissive stream reader includes binary in the exception output; looks like field separators or other low ASCII control characters. |
15:54 |
Dyrcona |
Maybe this has not been worth the effort. |
16:19 |
|
jihpringle joined #evergreen |
16:44 |
Dyrcona |
I wonder, too, if sometimes the errors cascade. That is, an error reading a previous record ends up messing up the read of the next few records, even if they might be correct. There are things that will do that with MARC::Record. |
16:51 |
|
jihpringle joined #evergreen |
17:05 |
|
mmorgan1 left #evergreen |
18:21 |
|
sandbergja joined #evergreen |
11:21 |
jeff |
oh, never mind. I think that's exactly what you said you were doing, and I misread. |
11:24 |
Dyrcona |
Yeah. We're sending records to someone parsing them with MARC4J. I'm trying to implement a program to find records that MARC4J doesn't like and then output a spreadsheet of the errors for our catalogers. |
11:24 |
Dyrcona |
I may go back to java.nio.Pipe. I had that sort of working, but when I added InputStream to the mix, the program would hang. |
11:26 |
Dyrcona |
I know I'm getting I/O deadlock, and I'm using classes that are recommended for use with different threads in a single thread. Maybe if I spin off a thread for the MARC reader, but then I need to also get the database id in that thread somehow..... |
11:26 |
Dyrcona |
There's too much computer science in Java. It's definitely not a hacker's language. |
11:27 |
|
kmlussier joined #evergreen |
11:28 |
Dyrcona |
I am flushing the OutputStream before trying to read from the other end of the pipe.... Pipes in C are so much easier. Well, maybe I'm more familiar with pipes in C. :) |
12:12 |
Dyrcona |
That's at 11:24 EST. :) |
12:13 |
Dyrcona |
I've already done this for a set of records that evergreen-universe-rs doesn't like. |
12:13 |
berick |
oh, gotcha |
12:15 |
Dyrcona |
I think I'll put this down for now and work on a program to convert some marcxml to binary marc to see if CPAN's RT will let me upload that. I get a 403 when I try to upload the marcxml examples from the Rust test. |
12:15 |
Dyrcona |
"That should only take half an hour," he said knowing it was very likely to be a lie. |
12:16 |
Dyrcona |
Also, mexican_coca-cola++ It tastes so much better with cane sugar than with HFCS. |
12:22 |
Dyrcona |
What? libmarc-perl does not install MARC::File and friends? I thought that it did. |
12:23 |
* Dyrcona |
grumbles about CPAN.....and that half hour will be spent just installing the tools. |
12:33 |
Dyrcona |
marcdump [options] file(s) That's useful.... |
12:34 |
Dyrcona |
And, I have to write my own. marcdump doesn't work on XML. |
12:35 |
Dyrcona |
I'm just full of complaints today, aren't I? |
12:44 |
Bmagic |
you? never! |
12:44 |
Dyrcona |
I'm installing MARC::File::XML with cpan set to local::lib, and there sure are a lot of prerequisites. |
12:50 |
Dyrcona |
Failed 11/11 test programs. 3/5 subtests failed. |
12:50 |
Dyrcona |
Right. I'll just run it on a server where this is already installed. |
12:51 |
Dyrcona |
And, I'll wipe out the stuff that CPAN installed locally. |
13:08 |
Dyrcona |
Looks like I may have to reboot. I just swapped monitors and the laptop doesn't see the new one. |
13:14 |
|
Dyrcona joined #evergreen |
13:16 |
Dyrcona |
hey! That's funny. MARC::Batch catches some of these errors: Leader must be 24 bytes long |
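That leader check is about the simplest structural validation a MARC record can fail, since a MARC 21 leader is exactly 24 bytes. A sketch of the same kind of check (the sample leaders are invented):

```python
LEADER_LENGTH = 24  # a MARC 21 leader is exactly 24 bytes

def leader_problems(leader):
    """Return a list of complaints about a MARC leader string."""
    problems = []
    if len(leader) != LEADER_LENGTH:
        problems.append("Leader must be 24 bytes long")
    elif not leader[:5].isdigit():
        # Bytes 0-4 carry the total record length, zero-padded.
        problems.append("Record length (leader bytes 0-4) is not numeric")
    return problems

ok = leader_problems("00714cam a2200205 a 4500")   # plausible leader
bad = leader_problems("00714cam")                  # truncated leader
```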
13:19 |
Stompro |
Dyrcona, have you looked at MARC::Lint already? |
13:33 |
Dyrcona |
Stompro: Never heard of it. |
13:33 |
Dyrcona |
Apparently, all of the software in the world has chosen this week to hate me: https://rt.cpan.org/Ticket/Display.html?id=150348&results=a5d68555ff4b4354e65ce6ec51f76634 # Read to the bottom... |
13:37 |
Dyrcona |
gmcharlt: RT on CPAN is apparently broken for uploads at the moment. I've tried 3 times to add a file of records to that ticket above. |
13:39 |
Dyrcona |
Stompro++ I'll give MARC::Lint a whirl. |
13:57 |
Dyrcona |
Stompro: It looks like MARC::Lint may help. I'm running a test program already. |
13:59 |
Stompro |
I wonder if it will be too verbose, or if you can pick out the bigger issues. I'm curious how it performs also? |
13:59 |
Dyrcona |
And, maybe not so much: is_valid_checksum: Didn't get object! at /usr/share/perl5/Business/ISBN.pm line 481, <DATA> line 244. |
13:59 |
Dyrcona |
Well, it gets totally clobbered by our data after bib id 233519. |
14:05 |
Dyrcona |
Dunno. That could be what it exploded on. I'm trying again with an eval BLOCK. |
14:06 |
Dyrcona |
If it gets all the way through I'll use CSV, and output the warnings to a csv. I might output the errors to a separate one. |
14:08 |
Dyrcona |
My catalogers will be sorry that they ever asked for this. :) |
14:10 |
Dyrcona |
MARC::Lint seems to find something in nearly every record. |
14:39 |
|
terranm joined #evergreen |
14:54 |
Dyrcona |
hmm... What's the limit of rows in Excel, 32,000? I may have to split this up. |
14:55 |
|
kmlussier1 joined #evergreen |
15:03 |
* Dyrcona |
tries hot swapping monitors again. If I disappear, I had to reboot. |
15:10 |
Dyrcona |
Well, looks like I have to reboot. |
15:16 |
|
Dyrcona joined #evergreen |
15:28 |
Dyrcona |
Looks like MARC::Lint uses its own eval, and the errors that it passes up to the client program are not very useful for a cataloger: "Can't locate object method "checksum" via package "0316110620" (perhaps you forgot to load "0316110620"?) at /usr/share/perl5/Business/ISBN.pm line 484, <DATA> line 244." |
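For reference, the ISBN-10 checksum that Business::ISBN was being asked about is simple to compute directly. A sketch, using the ISBN from the error message above (which, checked by hand, is actually valid):

```python
def isbn10_checksum_ok(isbn):
    """ISBN-10 check: weighted digit sum (weights 10 down to 1) must be
    divisible by 11; a final 'X' stands for the value 10."""
    digits = isbn.replace("-", "")
    if len(digits) != 10:
        return False
    total = 0
    for weight, ch in zip(range(10, 0, -1), digits):
        if ch == "X" and weight == 1:
            value = 10
        elif ch.isdigit():
            value = int(ch)
        else:
            return False
        total += weight * value
    return total % 11 == 0

valid = isbn10_checksum_ok("0316110620")
```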
15:32 |
Dyrcona |
Stompro: Do you know about Tk::MARC::Editor and MARC::ErrorChecks? |
15:41 |
|
dluch joined #evergreen |
15:45 |
|
jihpringle joined #evergreen |
15:49 |
Stompro |
Dyrcona, nope, I haven't looked at them before. |
15:50 |
Dyrcona |
I had a quick look at MARC::Errorchecks and it seems more cumbersome and nitpicky than MARC::Lint. |
16:06 |
|
pinesol joined #evergreen |
17:08 |
|
mmorgan left #evergreen |
18:26 |
|
briank joined #evergreen |
09:00 |
|
sleary joined #evergreen |
09:00 |
|
smayo joined #evergreen |
09:02 |
|
mmorgan1 joined #evergreen |
09:10 |
Dyrcona |
Stompro: It is 1,736,893 bibs. I can't really count the items because the main file is MARC binary. |
09:11 |
Dyrcona |
BTW: I think the Perl MARC::Record is too permissive. I'm going to do some research on it and probably open a ticket in CPAN's RT. |
09:12 |
Dyrcona |
We have a record with an empty 008 as an example. Evergreen deals with it just fine, but other systems mangle it. |
09:24 |
Dyrcona |
The simple things seem to be a problem for me this morning, like opening files in Perl.... I know what the problem is... No music is playing! |
09:25 |
|
mantis1 joined #evergreen |
14:00 |
terranm |
Weekly 3.12 code review if anyone wants to join - https://www.google.com/url?q=https://princeton.zoom.us/my/sandbergja |
14:13 |
|
ejk_ joined #evergreen |
15:21 |
jeff |
terranm++ for sharing the link here |
15:36 |
Dyrcona |
So, I'm thinking of implementing a third MARC exporter for Evergreen. This one would be written in Java using MARC4J to catch things that Aspen won't like. It's not so much meant for everyday use. |
15:54 |
Dyrcona |
Looking through the code I wrote for the MVLC migration to Evergreen from Horizon has given me some ideas. I had a SAX parser in there to check the MARCXML before trying to load a record. It would delete "empty" subfields and control fields. I could just yank the MARCXML from the database and check for busted elements as a start. |
16:07 |
csharp_ |
jeffdavis++ # PG 14 |
16:59 |
|
mantis1 left #evergreen |
10:08 |
Dyrcona |
I'm not entirely sure how vis_attr_vector is used, but I'm sure it's most important for records with URIs. |
10:13 |
sleary |
sandbergja: thank you for changing that route :) |
10:14 |
Bmagic |
update config.internal_flag set enabled='f' where name~'ingest.reingest.force_on_same_marc' |
10:15 |
Dyrcona |
Bmagic: Yeah. You may or may not want that flag enabled all the time. It depends on how much/how often your MARC actually changes. |
10:16 |
Dyrcona |
You might be surprised to see how often MARC gets updated when it hasn't changed. |
10:16 |
Bmagic |
I think that was the trick, still playing with it |
10:18 |
Bmagic |
confirmed, that was it |
10:18 |
Bmagic |
I could have swore I checked that before I started posting here |
10:56 |
Dyrcona |
Awesome sauce! The "obvious" approach works. |
10:57 |
Dyrcona |
`psql -v outputdir=output -f script`, then in the script: `\o :outputdir/outfile.dat`. |
10:58 |
Dyrcona |
Think I'll shorten it to 'outdir' for the actual thing, though. |
11:24 |
Dyrcona |
Bmagic (and berick for that matter): I'm going to run the Rust MARC exporter either later today or tonight to capture the error output. It seems to find more "bad" records than the Perl code. Just thought I'd give you a heads-up. I don't expect the load to spike on the utility server, but you never know. |
11:25 |
Bmagic |
Dyrcona++ # you go on with your bad self |
11:25 |
berick |
Dyrcona: cool, be curious to see what you find |
11:26 |
Dyrcona |
Actually, I'll schedule it for 9:00 PM since we don't seem to have any database updates requested. I'll write something to parse the error output afterward. (It will be good practice.) |
11:26 |
berick |
also what Bmagic said |
11:26 |
Dyrcona |
:) |
11:27 |
Dyrcona |
I'm going to extract all of the records, and I won't bother with holdings. |
11:28 |
Dyrcona |
berick: Can the eg-marc-export do authorities, too? If not, that would be a useful feature. |
11:28 |
berick |
Dyrcona: not yet |
11:28 |
Dyrcona |
I'll work on a pull request, then. ;) |
11:28 |
berick |
awesome |
10:19 |
Stompro |
I figured, my perl array skills need work. :-) |
10:20 |
Dyrcona |
Maybe my suggestion requires more rearrangement of the code, though. Having a firsttag flag might fit better with the current code organization. |
10:20 |
Dyrcona |
I wonder if the first one even needs to be grouped? |
10:20 |
Dyrcona |
I'm going to look at MARC::Record again. |
10:21 |
Dyrcona |
Stompro++ # For the notes in the snippets. |
10:21 |
Stompro |
In my test data, the 901 tag would be placed before the 852 without using the insert_grouped_field for the first. |
10:23 |
Stompro |
I don't think MARC::Record re-orders the fields. |
12:38 |
Dyrcona |
Heh. Almost 1 minute longer..... |
12:50 |
|
collum joined #evergreen |
13:02 |
Dyrcona |
I am testing this now: time marc_export --all -e UTF-8 --items > all.mrc 2>all.err |
13:14 |
Dyrcona |
The Rust marc export does batching of the queries by adding limit and offset. I wonder if we should do the same? I've noticed that the CPU usage goes up over time, which implies that something is struggling through the records. The memory use stays mostly constant once all of the records are retrieved from the database. |
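On the batching question: LIMIT/OFFSET paging degrades as the offset grows, because the server still has to walk past all the skipped rows, while keyset pagination (WHERE id > last-seen, ORDER BY id) keeps each batch cheap. A sketch against an in-memory SQLite table; the table name is illustrative, not Evergreen's schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE record_entry (id INTEGER PRIMARY KEY, marc TEXT)")
conn.executemany("INSERT INTO record_entry VALUES (?, ?)",
                 [(i, "<marc %d>" % i) for i in range(1, 26)])

def batches(conn, size):
    """Yield rows in id order, `size` at a time, without OFFSET:
    each query resumes from the last id seen."""
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, marc FROM record_entry"
            " WHERE id > ? ORDER BY id LIMIT ?", (last_id, size)).fetchall()
        if not rows:
            break
        yield rows
        last_id = rows[-1][0]

batch_sizes = [len(b) for b in batches(conn, 10)]
```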
13:20 |
Stompro |
Dyrcona, if you use gnu time, it gets you max memory usage also. /usr/bin/time -v... so you don't have to check that separately. |
13:25 |
Stompro |
Dyrcona, I'm surprised the execution time increased for you... hmm. |
13:28 |
Dyrcona |
Things are always weird here. |
16:13 |
berick |
Stompro++ eeevil++ # looks like cursors are an option -- will give it a poke |
16:17 |
Dyrcona |
I'll give cursors a poke, too. |
16:17 |
Dyrcona |
I think I commented about "cursors and Sybase" and my early experience with them at The Jockey Club and with Horizon last week. |
16:19 |
Dyrcona |
BTW, my maximum memory usage is 9GB. I think that's my biggest issue with MARC export. |
16:19 |
eeevil |
rewindable and writable cursors, and with-hold cursors, are not as fast as not-those-types in PG, but we don't need those, generally. |
16:19 |
Stompro |
With --items, 1.3G vs 256M for max resident memory, 596s vs 473s run time. (That also compares the 852 insert changes.) |
09:01 |
mantis1 |
This season has been terrible |
09:02 |
mantis1 |
I hope you'll be ok for Hackaway! |
09:06 |
|
Dyrcona joined #evergreen |
09:08 |
Dyrcona |
berick: I did `cargo build --release --package evergreen` then copied eg-marc-export to /openils/bin/. I missed the password on one of the two lines for eg-marc-export in my script, so I don't know if it is faster, but the binary is certainly smaller without the debugging symbols, etc. |
09:14 |
|
redavis joined #evergreen |
09:18 |
|
terranm joined #evergreen |
09:18 |
Dyrcona |
FWIW, I haven't used --release on my test system. I did that for the production server. |
10:10 |
csharp_ |
sounds good too |
10:21 |
Dyrcona |
Hmm. One of our marc_exports is still running since Tuesday night. |
10:21 |
Dyrcona |
I wonder if Perl has some kind of issue on virtual machines? |
10:22 |
Dyrcona |
Well, I can always replace it with eg-marc-export. |
10:32 |
berick |
Dyrcona: fwiw, the --release build chopped off 1/3 of the runtime for my 150k record+items export. |
10:32 |
berick |
depends on the data, i'm sure, though |
10:34 |
* JBoyer |
wonders what Kuma's Korner does for birthdays? |
14:30 |
Dyrcona |
berick: Speaking of Rust... I think you might have introduced a bug with a recent commit when you moved where OFFSET gets added. I got the following when using --query-file: |
14:31 |
JBoyer |
Ah, still catching up here and there. Something is still bonkers with that extract though given what we're seeing here. |
14:32 |
Dyrcona |
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: Some(Error), code: SqlState(E42601), message: "syntax error at or near \"OFFSET\"", detail: None, hint: None, position: Some(Original(630)), where_: None, schema: None, table: None, column: None, datatype: None, constraint: None, file: Some("scan.l"), line: Some(1123), routine: Some("scanner_yyerror") }) }', evergreen/src/bin/marc-export.rs:398:56 |
14:33 |
Dyrcona |
JBoyer: --items has always been slow for us, but it's worse than ever, and it really looks like it is Perl. |
14:34 |
berick |
Dyrcona: k. mind sharing your query file? |
14:35 |
Dyrcona |
berick: I'm modifying one of them. Let me try another run. |
14:35 |
berick |
Dyrcona: i think i see the issue.. |
14:37 |
Dyrcona |
One of the queries returns 3 columns: bre.id, bre.marc, and count(acp.id). It didn't look like the 3rd column would be an issue. |
14:38 |
Dyrcona |
JBoyer: I would not be surprised if there is something wrong with the Perl versions I'm using or something, but I don't feel like I have time to deal with that. I'm under pressure to get them records last week. :) |
14:40 |
berick |
the chunked processing requires ordering/limiting/offsetting, which adds additional constraints to the format of the query file. for now, could just read the query file as-is and avoid any paging. |
14:41 |
JBoyer |
True, finding the Right Fix when you're under a Right Now deadline is a lot like being technically correct but completely unhelpful. :) I just think that *after* you can get the initial export done and transported, replacing the exporter won't necessarily be the ideal fix. (Though, for later, I'm also curious what OS the super slow exporter is running on) |
14:51 |
|
jihpringle joined #evergreen |
14:52 |
berick |
Dyrcona: pushed a patch to avoid modification of the --query-file sql |
14:52 |
berick |
re: id, any chance you have multiple columns resolving to the name "id"? |
14:53 |
Dyrcona |
berick: Cool. I might.... I'm going to modify the query that I think is blowing up to use a CTE, and then grab bre.id and bre.marc. |
14:53 |
Dyrcona |
Currently, it's actually returning acn.record, bre.marc, and the count on acp.id. |
14:56 |
Dyrcona |
Dude..... I just noticed the file ends with two semicolons.....I'll bet that's it. |
14:56 |
Dyrcona |
Still, I think I'll do the CTE. |
14:57 |
berick |
my test file contains: SELECT id, marc FROM biblio.record_entry WHERE NOT deleted (no semicolons needed) |
14:59 |
Dyrcona |
Yeah, ; is a habit from writing stuff for psql. |
15:07 |
pinesol |
News from commits: LP2035287: Update selectors in e2e tests <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=f562b3ac30a3753d63d565c2d7be4d3a7121a2fb> |
15:11 |
Dyrcona |
Now, I'm cooking with gas. I got the "large" bibs file (85 records with 87,296 items) dumped to XML in under a minute. That took almost 2 hours with the Perl program the other day. |
15:18 |
Dyrcona |
Using query-file to feed the eg-marc-export, it is using a lot more RAM than before, about the same as the Perl export was using. It still uses less CPU. We'll see if that changes over time. |
15:28 |
jeff |
you have 87,296 items that are spread across only 85 bibs? |
15:29 |
Dyrcona |
Yes, we do. |
15:29 |
jeff |
color me intrigued. |
13:20 |
jeffdavis |
yes, fairly frequently |
13:28 |
pinesol |
News from commits: LP#2007603: restore functioning of default search tab preference <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=adf3bd07e0e4558e9d48f9cdf5081e956eef7866> |
13:55 |
|
jihpringle joined #evergreen |
13:59 |
Dyrcona |
berick: eg-marc-export appears to be way more efficient than marc_export. However, the XML it produces isn't pretty printed, so I'll have to split records with sed or something to see what I've got when it is done. |
14:06 |
berick |
Dyrcona: ah, i've been piping to xmllint --format |
14:06 |
berick |
but could easily add a --pretty-print option |
14:12 |
berick |
pushed --pretty-print-xml option. added some other options since yesterday as well. |
14:13 |
mantis1 |
Is recall and force holds a library setting? |
14:18 |
berick |
mantis1: when placing an item-level hold in the staff catalog, the options should be available. |
14:18 |
berick |
they do have their own permissions |
14:18 |
Dyrcona |
berick: I could have just dumped a binary marc file. I've got a thing to count records for that. |
14:19 |
Dyrcona |
I pasted your example from yesterday and changed the filename. |
14:20 |
jihpringle |
mantis1: depending on what you're doing there are three library settings that control how recalls work |
14:21 |
mantis1 |
berick: it might be the permissions then that I need to assign |
14:21 |
mantis1 |
berick++ |
14:22 |
mmorgan |
mantis1: I think those hold options are only available under Request Item in Item Status. |
14:22 |
berick |
mmorgan: they're in the Ang staff catalog |
14:22 |
Dyrcona |
berick is there a way to just build eg-marc-export? |
14:23 |
mmorgan |
Oh! I should have known that! |
14:23 |
berick |
Dyrcona: sorta, not really. you can build --package evergreen, but it of course builds its opensrf dependencies |
14:24 |
Dyrcona |
OK. Thanks! |
14:58 |
berick |
looking at some other options too |
15:01 |
Dyrcona |
I should have time'd this run... I'll do that next time. |
15:02 |
|
mantis1 left #evergreen |
15:24 |
Dyrcona |
berick: It doesn't look like eg-marc-export can be fed a list of ids in a pipe. |
15:24 |
Dyrcona |
That was more a question, really. |
15:28 |
Dyrcona |
It finds lots of errors, too. |
15:32 |
berick |
Dyrcona: no, that's not yet supported |
15:34 |
Dyrcona |
I'm going to start over with some different options and time it. |
15:34 |
berick |
k. --query-file may need some work too. it wants 'id' and 'marc' columns in the query, which may vary from the marc_export version. |
15:35 |
* berick |
looks at --pipe |
15:35 |
Dyrcona |
Yeah, but I could use a modified version of the SQL to get the list of ids, just add the marc column. |
15:35 |
berick |
yeah |
15:35 |
Dyrcona |
marc_export just takes ids on stdin. |
15:36 |
Dyrcona |
Think I'll just dump all records with --items to a binary file, time it, and see what I get and how long it takes. I'll dump stderr to a file to see if I can fix some of these records. I see a bit of "bad subfield code." |
11:02 |
Dyrcona |
We will also likely never use it in production.... |
11:05 |
Dyrcona |
It might need more patches than just that one.... I'll leave it for now. |
11:14 |
|
kmlussier joined #evergreen |
11:18 |
Dyrcona |
So, going back to yesterday's conversation about MARC export, I wonder if that commit really was the problem. I reverted that one and two others, then started a new export. It has been running for almost 21 hours and only exported about 340,000 records. I estimate it should export about 1.7 million. |
11:20 |
Dyrcona |
At that rate, it will still take roughly 5 days to export them all. This is on a test system, but it's an old production database server and it's "configured." The hardware is no slouch. I guess I will have to dump queries and run them through EXPLAIN. |
11:28 |
Dyrcona |
Y'know what. I think I'll stop this export, back out the entire feature and go again. |
11:29 |
jeff |
if it's similar behavior as yesterday and most of the resource usage appears to be marc_export using CPU, I'd suspect inefficiency in the MARC record manipulation or in dealing with the relatively large amount of data in memory from the use of fetchall_ on such a large dataset. |
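To illustrate jeff's point: a fetchall-style call materializes every row before processing starts, which is where the memory goes on a multi-million-record export. Fetching in fixed-size chunks keeps the footprint flat. A Python DB-API sketch (Perl DBI's fetchall_arrayref with a row limit behaves similarly); the table and data are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bre (id INTEGER PRIMARY KEY, marc TEXT)")
conn.executemany("INSERT INTO bre VALUES (?, ?)",
                 [(i, "record %d" % i) for i in range(1, 1001)])

cur = conn.execute("SELECT id, marc FROM bre ORDER BY id")
seen = 0
while True:
    # Only this many rows are held in client memory at once, instead of
    # the entire result set a fetchall() would materialize.
    chunk = cur.fetchmany(100)
    if not chunk:
        break
    seen += len(chunk)
```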
10:27 |
Dyrcona |
`ps -o etime 24586` said 4-19:00:01 just a few seconds ago. |
10:28 |
Dyrcona |
I'm running it with time, but was curious how long it has been going so far. |
10:29 |
Dyrcona |
Adding --items seems to really slow it down on this setup. |
10:33 |
Dyrcona |
I should try it with a binary MARC file to see if that makes a difference. I wonder if writing the output locally is a problem, though I doubt it. |
10:36 |
Dyrcona |
The db server does not appear to be under any strain. |
10:37 |
Dyrcona |
Load is 0.08 and plenty of free RAM, which could mean it's not cached, but with NVMe, who needs cache? ;) |
10:44 |
|
sandbergja joined #evergreen |
10:44 |
Dyrcona |
Makes me wonder if we're missing an index, or if adding a new index might help. It would be nice if there was an easy way to dump the SQL from Perl DBI... Maybe there is. I should check. |
10:48 |
Dyrcona |
I suppose I could hack a copy of marc_export to dump the SQL instead of executing it. |
10:50 |
|
briank joined #evergreen |
10:50 |
Dyrcona |
I'd like to run it through explain. It's probably the queries to grab item information to add to the MARC, so I'll have to dump an example of that, too. |
10:52 |
Dyrcona |
Guess I will be looking into it later.... *sigh* |
10:52 |
jeff |
or tweak log_min_duration_statement just long enough to capture some sample queries. depends on how otherwise loaded your db server is, if this is prod. |
10:56 |
Dyrcona |
This is a test system that hosts multiple databases, but this is the only instance currently doing anything. |
11:03 |
Dyrcona |
It's not running on the same server as the DB either. |
11:06 |
Dyrcona |
I'll have to do some investigation to see where the problem lies. Maybe I can get some improvements for everyone out of this. |
11:20 |
Dyrcona |
jeff++ # I may just crank the logging up for a test run later. I suspect this one will finish sometime later today, but I also thought that it would have done by yesterday to start with. |
11:24 |
Dyrcona |
FWIW, I'm dumping XML because it's "easier" to work with than binary MARC, but when a file is about 8GB in size, the format doesn't really matter any longer, does it? :) |
11:25 |
|
collum joined #evergreen |
11:33 |
|
kmlussier joined #evergreen |
11:39 |
|
jihpringle joined #evergreen |
13:41 |
Dyrcona |
jeff: I think some of the patches that I am testing are responsible for the slow down, particularly the one for the above Lp bug. |
13:45 |
Dyrcona |
I think I'll revert a couple of commits before I say much more. |
14:21 |
Dyrcona |
Hmm... Looks like I have somewhere in the vicinity of 400,000 records left to export. I think I'll stop this one and try again with the suspected commits reverted. |
14:25 |
Dyrcona |
Think I'll export to a binary MARC file this time. At least the file will be smaller. |
14:43 |
|
mdriscoll joined #evergreen |
14:50 |
|
shulabear joined #evergreen |
14:50 |
|
Stompro joined #evergreen |
09:50 |
|
kworstell-isl joined #evergreen |
10:12 |
|
Christineb joined #evergreen |
12:07 |
|
jihpringle joined #evergreen |
12:10 |
Dyrcona |
Binary MARC records with HTML entities in them.... I guess.... Whatever..... |
12:27 |
berick |
@decide binary-marc-with-html OR html-with-binary-marc |
12:27 |
pinesol |
berick: That's a tough one... |
12:57 |
jeff |
&2DzfVQ- is the IMAP4 modified UTF-7 (mUTF-7) encoding for U+1F355, aka the "pizza" emoji: 🍕 |
12:59 |
* jeffdavis |
backs away slowly |
13:02 |
Dyrcona |
Does that pizza emoji have pineapple on it? |
13:04 |
Dyrcona |
So, chardet3 is my new friend. It tells me UTF-8 encoded MARC files are UTF-8 with 0.99 confidence, and MARC-8 encoded MARC files are ISO-8859-1 with 0.70 to 0.75 confidence. |
13:04 |
Dyrcona |
chardet3 does not know about MARC-8. |
13:05 |
* Dyrcona |
looks for a similar module to Python's chardet in Perl. |
13:07 |
Dyrcona |
libencode-detect-perl is packaged for Ubuntu/Debian. |
13:12 |
Dyrcona |
It turns out that libraries can choose UTF-8 when downloading records from Overdrive. If they don't then the records are apparently MARC-8. I don't want to have to figure that out manually, so I'm going to make my record load program do that for me. |
13:15 |
Dyrcona |
And Encode::Detect won't do what I want, since it decodes the text using the detect encoding. That will break MARC-8. |
13:15 |
Dyrcona |
I feel like I've had this monologue before.... |
13:26 |
Dyrcona |
Aha! It detects ascii if there are no "fancy" characters in the MARC-8 files. |
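[Editor's note: a minimal Python sketch of the detection logic discussed above, not the Perl tooling Dyrcona is writing. The facts it relies on: Leader/09 is 'a' for UCS/Unicode and blank for MARC-8 per the MARC 21 spec, leaders from vendors can lie, and a pure-ASCII record is byte-identical in both encodings. The function name is hypothetical.]

```python
def guess_marc_encoding(record: bytes) -> str:
    """Guess the character encoding of one binary MARC record.

    Leader/09 is 'a' for UCS/Unicode and blank for MARC-8, but
    vendor-supplied leaders are unreliable, so verify the claim
    against the actual bytes.
    """
    # Pure ASCII is valid in both encodings, so either label works.
    if all(b < 0x80 for b in record):
        return "ascii"
    if len(record) > 9 and record[9:10] == b"a":
        # Leader claims Unicode; confirm it actually decodes.
        try:
            record.decode("utf-8")
            return "utf-8"
        except UnicodeDecodeError:
            pass  # leader lied; fall through
    return "marc-8"
```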
13:35 |
|
rfrasur joined #evergreen |
14:26 |
Dyrcona |
1 file changed, 39 insertions(+), 7 deletions(-) # Hopefully that does it! |
14:43 |
scottangel |
I'm looking at the 'strict barcodes' checkbox on the patron checkout page. Looks like it's bound to a variable called $scope.strict_barcode. The problem I'm facing is the function ng-change="onStrictBarcodeChange()" doesn't flip the boolean. Am I missing something? shouldn't there be something like $scope.strict_barcode = !$scope.strict_barcode; From what I can tell is this setting is meant to be saved w/ egCore.hatch.setItem() but |
14:52 |
Dyrcona |
Good question. I don't know. |
15:13 |
Dyrcona |
Meh... I should spell check commit messages before pushing.... |
16:01 |
|
BDorsey joined #evergreen |
16:37 |
Dyrcona |
Whee! MARC-8 encoded record, says it's UTF-8 in the leader but \xE1\x65 is MARC-8 for \xC3\xA8 in UTF-8. |
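[Editor's note: a small Python illustration of the byte sequences above. MARC-8 puts the combining grave accent (0xE1) before the base letter, while UTF-8 encodes the precomposed letter as a two-byte sequence; the MARC-8 bytes are also invalid UTF-8, which is how a record whose leader claims UTF-8 gets caught out.]

```python
marc8_e_grave = b"\xe1\x65"   # combining grave accent + 'e' in MARC-8
utf8_e_grave = b"\xc3\xa8"    # U+00E8 LATIN SMALL LETTER E WITH GRAVE

# The UTF-8 pair decodes to the expected character.
assert utf8_e_grave.decode("utf-8") == "\u00e8"

# The MARC-8 pair is not valid UTF-8: \xe1 opens a three-byte
# sequence, but \x65 ('e') is not a continuation byte.
try:
    marc8_e_grave.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
assert not valid_utf8
```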
16:40 |
Dyrcona |
NB: I haven't done anything to the file other than inspect with my editor. |
16:40 |
Dyrcona |
Grr..... Omitted a word there. |
16:45 |
Dyrcona |
Also....Editing text in browser text boxes stinks.... |
09:46 |
|
dguarrac joined #evergreen |
10:25 |
|
Christineb joined #evergreen |
10:54 |
Dyrcona |
Hm.. I wonder how workable it would be to replace ISO-8859-1 copyright symbols with UTF-8 ones using a regex.... It seems like it would be simple, but it's the kind of thing that can lead to problems. I have a file of records that I can play with.... |
10:58 |
Dyrcona |
Looks like it happens with registered trademark symbol, too. (Gotta love vendor-supplied MARC records.) |
11:06 |
Bmagic |
Love em indeed |
11:16 |
Dyrcona |
Bmagic: Does your load process handle things like that? I have a --strict option on one of my load programs that rejects records with bad characters, well any warnings are treated as errors, really. |
11:16 |
Dyrcona |
I'm considering adding code to fix copyright and registered trademark symbols since they seem to be a thing with this one vendor in particular. |
11:21 |
Bmagic |
Dyrcona: but FWIW: https://github.com/mcoia/sierra_marc_tools/blob/master/auto_rec_load/dataHandler.pm around line 800, if it dies, it will failover to readMARCFileRaw |
11:22 |
Dyrcona |
Bmagic: Thanks. I've had a glance at that code before. My issue isn't reading the records. We get these warnings when loading them: utf8 "\xA9" does not map to Unicode at /usr/lib/x86_64-linux-gnu/perl/5.26/Encode.pm line 212, <GEN1> chunk 300. |
11:23 |
Dyrcona |
They will load in the database, if I let them go in. |
11:24 |
Dyrcona |
The warnings don't occur while preprocessing the records using MARC::Record to modify the 856 tags. |
11:24 |
Bmagic |
writing something to transcode one character to another seems doable. But I've not done it. Sorry :( |
11:25 |
Dyrcona |
Yeah, it's actually transcoding to 2 characters \xA9 -> \xC2\xA9. |
11:25 |
Bmagic |
it sounds like you'll end up having to read the file character by character instead of letting MARC::Record do it? |
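[Editor's note: one way to do the transcoding discussed above, sketched in Python rather than the Perl involved; the function name is hypothetical. If a field already decodes as UTF-8 it is left alone; otherwise it is treated as Latin-1 and re-encoded, which maps the bare \xA9 to the UTF-8 pair \xC2\xA9 (and \xAE to \xC2\xAE for the registered trademark sign). Note this would mangle genuine MARC-8 data, so it only applies once a record is known to be intended as UTF-8.]

```python
def fix_field_encoding(data: bytes) -> bytes:
    """Re-encode field data containing stray Latin-1 bytes.

    Valid UTF-8 passes through untouched; anything else is
    reinterpreted as Latin-1 and re-encoded as UTF-8, so
    \xA9 -> \xC2\xA9 and \xAE -> \xC2\xAE.
    """
    try:
        data.decode("utf-8")
        return data
    except UnicodeDecodeError:
        return data.decode("latin-1").encode("utf-8")
```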
13:02 |
miker |
jeff: re LP 1829295, without wading into the bug itself, a big +1 to a YAOUS for respecting closed dates (and, you can just delete the row from config.org_setting_type, probably through the UI as the admin user!) |
13:02 |
pinesol |
Launchpad bug 1829295 in Evergreen "Shelf expire date doesn't respect closed dates" [Wishlist,Confirmed] https://launchpad.net/bugs/1829295 |
13:15 |
jeff |
miker: thanks for the feedback! i may have only half-followed you, though. which org unit setting type are you referring to? |
13:55 |
Dyrcona |
"Smart" quotes in MARC.....That seems to be what's causing problems with record sizes. |
13:55 |
|
mantis1 joined #evergreen |
13:56 |
rhamby |
smart_quotes-- |
13:56 |
sleary |
ugh |
14:03 |
Dyrcona |
sleary++ |
14:03 |
Dyrcona |
I'm leaning towards Windows and "copy and paste" cataloging. I was just converting from octal to see what the value is to look it up in UTF-8. |
14:05 |
sleary |
Quotes and apostrophes copied from Word in Windows used to truncate content in WordPress constantly. Good times. |
14:06 |
Dyrcona |
They truncate MARC records, too, because one of the characters in the sequence is the MARC End of Record character. I specifically use code to look for End of Field followed by End of Record to avoid this. |
14:07 |
Dyrcona |
Looks like our Perl MARC code doesn't calculate a proper record length, but those characters shouldn't be in a MARC record in the first place. |
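[Editor's note: a Python sketch of the two points above. A smart quote that ends up UTF-16 encoded contains the MARC End of Record byte (0x1D), which is why a naive split truncates records; requiring End of Field (0x1E) immediately before End of Record avoids the false positive. The leader check uses the MARC 21 rule that Leader/00-04 holds the record length as five ASCII digits. Function names are hypothetical.]

```python
END_OF_FIELD = b"\x1e"
END_OF_RECORD = b"\x1d"

# UTF-16BE for U+201D (right curly quote) contains the End of
# Record byte, so stray UTF-16 text can truncate a record:
assert "\u201d".encode("utf-16-be") == b"\x20\x1d"

def split_records(data: bytes) -> list:
    """Split a file of binary MARC records, tolerating stray \x1d
    bytes inside field data by splitting only on End of Field
    followed immediately by End of Record."""
    chunks = data.split(END_OF_FIELD + END_OF_RECORD)
    # Reattach the terminator pair to each non-empty record.
    return [c + END_OF_FIELD + END_OF_RECORD for c in chunks if c]

def leader_length_ok(record: bytes) -> bool:
    """Leader/00-04 must equal the record's actual byte length;
    a mangled record usually fails this check."""
    try:
        return int(record[0:5]) == len(record)
    except ValueError:
        return False
```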
14:07 |
rhamby |
utf16-- |
14:08 |
Dyrcona |
Heh. If only everything was big endian UTF-32....drive manufacturers would be happy.... :) |
14:16 |
mmorgan |
@quote get 232 |
14:16 |
pinesol |
mmorgan: Quote #232: "<mmorgan> Smart quotes are kinda like smart TVs in that neither are all that smart" (added by Dyrcona at 04:47 PM, October 19, 2022) |
14:17 |
Dyrcona |
mmorgan++ |
14:18 |
Dyrcona |
There's a bug in the CPAN RT for MARC::Batch or MARC::Record (maybe on Github, too), and I don't think the solution actually works. |
14:23 |
Dyrcona |
I thought tsbere had a proposed solution to this one: https://rt.cpan.org/Public/Bug/Display.html?id=70169 |
14:28 |
|
jihpringle joined #evergreen |
14:32 |
Dyrcona |
I have been told that this can happen if people copy and paste from Amazon when cataloging. |
15:19 |
Dyrcona |
Yeahp. GNU Emacs also says the file of bad records is UTF-16 when I open it. |
15:20 |
* Dyrcona |
wonders if pinesol has any dry kona in the coffee database. |
15:24 |
|
sleary joined #evergreen |
15:26 |
Dyrcona |
It seems odd to me that a program using MARC::Record->new_from_usmarc can read these records, modify them, and write them out without issue, but another program using the same Perl module blows up. I suspect the writing out leads to a bad length in the LDR. |
15:32 |
Dyrcona |
oh! |
15:33 |
Dyrcona |
The original records display just fine in GNU Emacs..... They get mangled going through Perl. |
15:34 |
Dyrcona |
I see the curly apostrophes and quotes, and Emacs says the coding system is multi-byte UTF-8. Something has a double encoding problem with these characters. |
15:35 |
Dyrcona |
I wonder if I'll even be able to fix this in a reasonable manner? |
15:36 |
Dyrcona |
Perl's Unicode support is so broken.... |
15:42 |
Dyrcona |
See... this is what I dislike about Unicode in Perl (particularly with MARC): one time I encode/decode the records and it works. Next time, the records come out garbled. Mebbe I should reread the Unicode FAQ and double check the MARC code to know what's really going on here. |
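[Editor's note: a Python illustration of the most common version of the round-trip failure described above; this is one plausible failure mode, not a diagnosis of the Perl code in question. Already-correct UTF-8 bytes get decoded as Latin-1 and re-encoded as UTF-8, doubling every multi-byte character into mojibake.]

```python
curly = "\u2019"                     # right single quotation mark
once = curly.encode("utf-8")         # correct encoding: 3 bytes
twice = once.decode("latin-1").encode("utf-8")  # double-encoded

assert once == b"\xe2\x80\x99"
assert twice == b"\xc3\xa2\xc2\x80\xc2\x99"     # 3 bytes became 6

# When no bytes were lost, recovery is the inverse round trip:
assert twice.decode("utf-8").encode("latin-1") == once
```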
16:33 |
pinesol |
News from commits: Docs: global flags docs fixes <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=3f7b48566d3f34e07c5b7ba5b27ed23d97abd4b4> |
16:44 |
|
jvwoolf left #evergreen |
17:00 |
|
mmorgan left #evergreen |