News
Wollongong's Digital Archive Push Hits a Wall Over Duplicate Image Problem
A quiet but consequential data headache has slowed the City of Wollongong's push to digitise historical records and heritage photography this week.
3 min read
The City of Wollongong's drive to build a publicly accessible digital archive of the Illawarra region's industrial and cultural history has stalled this week over a problem its information management team is still working to resolve: thousands of duplicate images that clog the system and make reliable public access impossible.
The issue surfaced publicly after the Wollongong City Library on Crown Street flagged delays to a planned update of its online heritage collection portal, which draws on photographic records dating back to the late 19th century. Duplicate image files — in some cases the same photograph stored four or five times under different filenames — have made it difficult for library staff to verify which version of a record is the authoritative one before publishing it to the public-facing catalogue.
The timing is awkward. Wollongong City Council allocated funding in its 2025–26 budget to accelerate digitisation of Illawarra heritage materials, with a target of making a new tranche of the collection available online before the end of the financial year on June 30. That deadline has now passed without the planned update going live. The library's digital services team is understood to be working through a deduplication process before any further material is published, though the scope of the backlog has not been formally quantified in any public statement from Council.
The problem is not unique to Wollongong. Institutions across New South Wales have wrestled with duplicate-image bloat as older scanning programs — many conducted in the 2000s and early 2010s without consistent file-naming conventions — were merged into newer content management systems. The State Library of NSW flagged similar catalogue integrity challenges during its own platform migration in 2023. For a regional archive like Wollongong's, which lacks the dedicated technical staff of a metropolitan institution, the clean-up work falls to a smaller team juggling competing priorities.
The Heritage Collections held at the Wollongong City Library include photographs of Port Kembla's steelworks dating from the 1920s, images of the former Bulli collieries, and street photography from Crown Street's commercial district across multiple decades. These records carry practical value beyond nostalgia: BlueScope Steel and researchers at the University of Wollongong's Faculty of Engineering and Information Sciences have drawn on the archive for heritage impact assessments related to the Port Kembla industrial precinct, where planning for a renewable energy zone is now well advanced.
Resolving duplicate images requires more than simply deleting copies. Each file may carry different metadata — scan dates, donor attribution, condition notes — meaning a raw deletion risks destroying information attached to what appears to be a redundant file. Standard practice involves comparing files algorithmically to identify exact and near-exact matches, then manually reviewing flagged pairs to confirm which record to retain and which to merge or archive. For large collections, specialist software can accelerate the comparison stage, but the review stage remains labour-intensive.
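As an illustration only, the first stage of that process (finding exact matches) can be sketched in a few lines of Python. This is not the library's actual tooling, which has not been described publicly; the function names `sha256_of` and `find_exact_duplicates` are hypothetical. The sketch groups byte-identical files by their SHA-256 digest and returns the groups for human review instead of deleting anything, so that metadata attached to each copy is not lost. Near-exact matches, such as rescans or resized copies, would need a perceptual comparison that is not shown here.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group files under `root` whose contents are byte-for-byte identical.

    Nothing is deleted: each returned group is a candidate set for manual
    review, because the copies may carry different metadata records.
    """
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[sha256_of(path)].append(path)
    # Keep only digests shared by more than one file.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `find_exact_duplicates(Path("scans"))` on a hypothetical `scans` folder would return one list per set of identical files, whatever their filenames. A reviewer would then decide which record to retain and which to merge or archive.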
The University of Wollongong's Library, based on Northfields Avenue in Keiraville, runs its own digital collections infrastructure separately from the City Library and has not reported the same problem. The two institutions have previously collaborated on Illawarra regional history projects, and there is scope for that partnership to extend to technical assistance on the current backlog, though no formal arrangement of that kind has been announced.
For residents and researchers who rely on the Crown Street library's online heritage portal, the practical upshot is continued delays to new material appearing in the catalogue. The library's physical collection remains fully accessible during normal opening hours. Anyone with specific research needs around Illawarra heritage photography is advised to contact the library's local studies team directly rather than waiting for the online catalogue to update. Council has not set a revised public date for the portal refresh, but the deduplication work was described in internal communications as a priority task for the current quarter ending September 30, 2026.
About this article
Published by The Daily Wollongong