View Issue Details

ID: 0000602
Project: file
Category: general
View Status: public
Last Update: 2017-03-24 15:32
Reporter: Galen Charlton
Assigned To: Christos Zoulas
Priority: normal
Severity: minor
Reproducibility: always
Status: resolved
Resolution: fixed
Product Version: 5.30
Target Version:
Fixed in Version: HEAD
Summary: 0000602: make MARC21 file detection stricter
Description: The attached patch enables distinguishing between a MARC file and the output of yaz-marcdump, a tool commonly used to make human-readable dumps of MARC records.
Tags: No tags attached.

Relationships

Activities

Galen Charlton

2017-03-23 15:43

reporter  

stricter-marc21-detection.patch (3,985 bytes)
From 798c86903e6d8e1d662df278fbb1ae35bd200ee2 Mon Sep 17 00:00:00 2001
From: Galen Charlton <gmcharlt@gmail.com>
Date: Thu, 23 Mar 2017 15:34:34 +0000
Subject: [PATCH] make MARC21 check stricter

Distinguish between MARC files, which must contain at least one
field terminator (\x1e) character, from the output of a tool
commonly used to produce human-readable display versions of MARC
files.
---
 magic/Magdir/marc21           |  4 +++-
 tests/marc21-ismarc.result    |  1 +
 tests/marc21-ismarc.testfile  |  1 +
 tests/marc21-notmarc.result   |  1 +
 tests/marc21-notmarc.testfile | 21 +++++++++++++++++++++
 5 files changed, 27 insertions(+), 1 deletion(-)
 create mode 100644 tests/marc21-ismarc.result
 create mode 100644 tests/marc21-ismarc.testfile
 create mode 100644 tests/marc21-notmarc.result
 create mode 100644 tests/marc21-notmarc.testfile

diff --git a/magic/Magdir/marc21 b/magic/Magdir/marc21
index 6819a8d..14b3f45 100644
--- a/magic/Magdir/marc21
+++ b/magic/Magdir/marc21
@@ -9,7 +9,9 @@
 
 
 # leader position 20-21 must be 45
-20	string	45
+20	string	    45
+# ... and a field terminator character should be present
+0	search/2048 \x1e
 
 # leader starts with 5 digits, followed by codes specific to MARC format
 >0	regex/1l	(^[0-9]{5})[acdnp][^bhlnqsu-z]	MARC21 Bibliographic
diff --git a/tests/marc21-ismarc.result b/tests/marc21-ismarc.result
new file mode 100644
index 0000000..fc59a13
--- /dev/null
+++ b/tests/marc21-ismarc.result
@@ -0,0 +1 @@
+MARC21 Bibliographic
\ No newline at end of file
diff --git a/tests/marc21-ismarc.testfile b/tests/marc21-ismarc.testfile
new file mode 100644
index 0000000..b59fc30
--- /dev/null
+++ b/tests/marc21-ismarc.testfile
@@ -0,0 +1 @@
+00962njm  2200253 a 450000100110000000500170001100800410002803500200006905000090008910000240009824500460012226000380016830000360020634900200024250501340026251101280039665000330052470000150055770000170057270000190058970000200060894900740062859600060070203-001813720020121132036.0020121s1999    cauuuu              eng d  a(Sirsi) a353305  a5490  aLovano, Joe,d1952-  aFriendly fire /cJoe Lovano and Greg Osby  aHollywood, CA :bBlue Note,c1999  a1 sound disc :bdigital, stereo  aCOMPACT DISC(S)  aGeo J Lo -- The wild east -- Serene -- Broad Way blues -- Monk's mood -- Idris -- Truth be told -- Silenos -- Alexander the Great  aJoe Lovano, saxophones and flute ; Greg Osby, saxophones ; Jason Moran, piano ; Cameron Brown, bass ; Idris Muhammad, drums 0aSaxophone with jazz ensemble  aOsby, Greg  aMoran, Jason  aBrown, Cameron  aMuhammad, Idris  a5490wLCi30007007467409rYtSOUNDlHUNT-AUDIOmHUNTINGTONxCOMPDISCS  a1
\ No newline at end of file
diff --git a/tests/marc21-notmarc.result b/tests/marc21-notmarc.result
new file mode 100644
index 0000000..9096549
--- /dev/null
+++ b/tests/marc21-notmarc.result
@@ -0,0 +1 @@
+ASCII text
\ No newline at end of file
diff --git a/tests/marc21-notmarc.testfile b/tests/marc21-notmarc.testfile
new file mode 100644
index 0000000..98af893
--- /dev/null
+++ b/tests/marc21-notmarc.testfile
@@ -0,0 +1,21 @@
+00962njm  2200253 a 4500
+001 03-0018137
+005 20020121132036.0
+008 020121s1999    cauuuu              eng d
+035    $a (Sirsi) a353305
+050    $a 5490
+100    $a Lovano, Joe, $d 1952-
+245    $a Friendly fire / $c Joe Lovano and Greg Osby
+260    $a Hollywood, CA : $b Blue Note, $c 1999
+300    $a 1 sound disc : $b digital, stereo
+349    $a COMPACT DISC(S)
+505    $a Geo J Lo -- The wild east -- Serene -- Broad Way blues -- Monk's mood -- Idris -- Truth be told -- Silenos -- Alexander the Great
+511    $a Joe Lovano, saxophones and flute ; Greg Osby, saxophones ; Jason Moran, piano ; Cameron Brown, bass ; Idris Muhammad, drums
+650  0 $a Saxophone with jazz ensemble
+700    $a Osby, Greg
+700    $a Moran, Jason
+700    $a Brown, Cameron
+700    $a Muhammad, Idris
+949    $a 5490 $w LC $i 30007007467409 $r Y $t SOUND $l HUNT-AUDIO $m HUNTINGTON $x COMPDISCS
+596    $a 1
+
Christos Zoulas

2017-03-24 15:32

manager   ~0001486

Added the search for field terminator.
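
The two conditions named in the patch can be sketched in Python as follows. This is an illustration only, not the libmagic implementation: the function name `looks_like_marc21` and the `search_window` parameter are hypothetical, the slice only approximates the range semantics of `search/2048`, and the sketch assumes both conditions must hold (leader positions 20-21 equal to "45", and a field terminator \x1e within roughly the first 2048 bytes).

```python
FIELD_TERMINATOR = b"\x1e"  # MARC field terminator named in the patch


def looks_like_marc21(data: bytes, search_window: int = 2048) -> bool:
    """Rough heuristic mirroring the two checks described in the patch.

    Hypothetical helper for illustration; file(1) itself evaluates
    these rules through its magic database, not through this code.
    """
    # Leader positions 20-21 must be "45".
    if data[20:22] != b"45":
        return False
    # A field terminator should appear near the start of the file.
    return FIELD_TERMINATOR in data[:search_window]


# The 24-byte leader shared by both test files in the patch.
leader = b"00962njm  2200253 a 4500"

# Binary MARC: directory and fields follow, separated by \x1e.
binary_record = leader + b"001001100000" + FIELD_TERMINATOR + b"03-0018137" + FIELD_TERMINATOR

# yaz-marcdump style text: same leader, but plain newlines and no \x1e.
text_dump = leader + b"\n001 03-0018137\n005 20020121132036.0\n"

print(looks_like_marc21(binary_record))
print(looks_like_marc21(text_dump))
```

Because the human-readable dump starts with the same leader as the binary record, the leader test alone cannot separate the two; the absence of \x1e in the dump is what makes the second check reject it.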

Issue History

Date Modified     Username         Field                                        Change
2017-03-23 15:43  Galen Charlton   New Issue
2017-03-23 15:43  Galen Charlton   File Added: stricter-marc21-detection.patch
2017-03-24 15:31  Christos Zoulas  Assigned To                                  => Christos Zoulas
2017-03-24 15:31  Christos Zoulas  Status                                       new => assigned
2017-03-24 15:32  Christos Zoulas  Status                                       assigned => resolved
2017-03-24 15:32  Christos Zoulas  Resolution                                   open => fixed
2017-03-24 15:32  Christos Zoulas  Fixed in Version                             => HEAD
2017-03-24 15:32  Christos Zoulas  Note Added: 0001486