Wollongong's network of public libraries, council archives, and university repositories is sitting on tens of thousands of duplicate digital images — redundant files accumulated over more than two decades of piecemeal digitisation projects that were never properly coordinated. The problem, long acknowledged quietly by archivists, is now forcing a formal response from institutions including Wollongong City Library on Crown Street and the University of Wollongong's Special Collections unit in the Bragg Building on Northfields Avenue.
The timing matters. Three converging pressures have pushed the issue from back-office headache to genuine public concern. Illawarra's cultural sector has been angling for a larger slice of the Illawarra Shoalhaven Regional Development Fund, which the NSW Government has used to channel money into post-industrial economic diversification across the region. Submitting credible, well-managed digital collections is a prerequisite for several grant streams. At the same time, BlueScope Steel's ongoing transition planning at Port Kembla has prompted heritage bodies to accelerate digitisation of industrial photography going back to the 1920s — creating fresh opportunities for duplication. And UOW's Faculty of Law, Humanities and the Arts has been expanding its community history programs, generating a new wave of donated digital material from local families and sporting clubs.
How the Backlog Built Up
The roots of the problem go back to roughly 2003, when Wollongong City Council first began scanning physical photograph collections held at its Burelli Street heritage centre. Technology was expensive, standards were inconsistent, and different digitisation rounds used different file-naming conventions, resolutions, and metadata schemas. When a second round of scanning was funded under a separate Commonwealth grant program in 2011, nobody systematically cross-checked what already existed. The same photographs — particularly images of the Illawarra coastline, the Flagstaff Hill precinct, and the old BHP steelworks — were scanned multiple times by multiple agencies working in parallel.
Community digitisation drives compounded the issue. The Wollongong region has at least a dozen active local history groups, from the Illawarra Historical Society based in the CBD to smaller bodies in Helensburgh, Dapto, and Kiama. Many donated digital copies of photographs to more than one institution simultaneously. Without a shared catalogue or deduplication protocol, each institution accepted the files and stored them independently. Estimates within the sector — based on internal audits that have not been publicly released — suggest duplicate or near-duplicate files may account for between 20 and 35 per cent of some collections, though that range should be treated cautiously given methodological differences in how institutions count files.
What Replacement Actually Means
Duplicate image replacement, in archival practice, is not simply deleting copies. It involves identifying canonical versions — usually the highest-resolution original scan — assigning persistent identifiers, redirecting or retiring lower-quality duplicates, and updating every catalogue record that pointed to the old file path. For a mid-sized institution, that process can run to hundreds of hours of skilled labour. At current archival contractor rates in NSW, which have risen sharply since 2023 alongside broader public sector wage pressures, a full deduplication project for a collection of 50,000 files can cost upward of $80,000.
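As a rough illustration of those steps, the Python sketch below assumes duplicate groups have already been identified by a separate detection step. It picks the highest-resolution scan in each group as the canonical copy, assigns a persistent identifier, and repoints catalogue records. The `Scan` type, the `img-000001` identifier scheme, the file paths, and the catalogue layout are all hypothetical and are not drawn from any institution's actual system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scan:
    path: str    # file path as stored in the catalogue (hypothetical)
    width: int   # pixel dimensions of the scan
    height: int


def choose_canonical(group):
    """Return the scan with the most pixels; ties go to the alphabetically first path."""
    return min(group, key=lambda s: (-(s.width * s.height), s.path))


def plan_replacement(groups, prefix="img"):
    """Map every path in each duplicate group to one persistent identifier.

    `groups` is a list of lists of Scan objects that an earlier detection
    step (not shown) has already judged to be duplicates of one another.
    """
    canonical_by_id = {}
    redirects = {}
    for n, group in enumerate(groups, start=1):
        pid = f"{prefix}-{n:06d}"  # hypothetical identifier scheme
        canonical_by_id[pid] = choose_canonical(group).path
        for scan in group:
            redirects[scan.path] = pid
    return canonical_by_id, redirects


def update_catalogue(catalogue, redirects):
    """Return a copy of the catalogue with old file paths swapped for identifiers.

    Records whose path is not part of any duplicate group are left unchanged.
    """
    return {record: redirects.get(path, path) for record, path in catalogue.items()}


# Example with made-up data: two scans of the same photograph from different rounds.
groups = [[Scan("2003/steelworks_01.tif", 1200, 800),
           Scan("2011/bhp-0042.tif", 4800, 3200)]]
canonical_by_id, redirects = plan_replacement(groups)
catalogue = {"rec-17": "2003/steelworks_01.tif",
             "rec-98": "2011/bhp-0042.tif",
             "rec-99": "2011/other.tif"}
updated = update_catalogue(catalogue, redirects)
```

Keeping the redirect map separate from the catalogue update means the lower-quality files can be retired later, once every record has been checked against its new identifier.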
UOW's Special Collections team has been piloting an open-source deduplication tool called dupeGuru across a subset of its photographic holdings since late 2025, according to publicly available project documentation on the university's library website. The pilot covered approximately 12,000 image files drawn from the Noel Butlin donation and the Wollongong Trades and Labour Council collection. Results from that pilot are expected to inform a broader policy recommendation to the Council of Australian University Librarians later this year.
For the public, the practical upshot is straightforward: search results on Wollongong City Council's online heritage portal and UOW's digital repository are likely to improve markedly once deduplication is complete, with fewer redundant results cluttering queries. Researchers working on projects related to Port Kembla's industrial history — an area of growing academic and community interest as the green steel transition reshapes the suburb — will find it easier to locate authoritative versions of key images. The institutions involved have signalled they expect a phased rollout through 2026 and into 2027, with Crown Street library's collections prioritised first.