Digitizing Việt Nam marks a digital leap forward in Vietnam Studies through a Columbia-Fulbright collaboration that began with a 2022 memorandum of understanding between the Weatherhead East Asian Institute and the Vietnam Studies Center. The Digitizing Việt Nam platform itself grew out of the Vietnamese Nôm Preservation Foundation's generous donation of the complete archive to Columbia University in 2018.




Delve into Vietnam's history, culture, and society through cutting-edge tools and curated resources tailored for scholars, students, and educators.
Explore our digital archive, dedicated to preserving and studying Vietnam's historical, cultural, and intellectual heritage.
Engage creatively with Vietnam Studies: use Digitizing Việt Nam's specialized tools to approach the field with fresh perspectives and critical insight.
Discover and teach Vietnam Studies with impact: explore curated syllabi, lesson plans, and multimedia resources designed to support innovative and inclusive learning experiences.
Latest news and discoveries from the digital front of Vietnamese heritage.

Queer Vietnam: A History of Gender Transgression, 1920–1945 by Richard Quang-Anh Tran reexamines the cultural history of late French colonial Vietnam through the lens of gender and sexual variance. Focusing on the interwar period (1920–1945), the book challenges the dominant assumption that Western imperial modernity uniformly narrowed and pathologized non-normative genders and sexualities across Asia. In Vietnam, Tran argues, the picture was more complex and uneven.
Opening with a reading of Khái Hưng’s serialized novella Hồn Bướm Mơ Tiên (Butterfly Soul Dreaming of an Immortal), the book situates modern literary experimentation alongside much older narrative traditions. The story of a young man who falls in love with a monk later revealed to be a woman echoes the premodern Buddhist tale of Quan Âm Thị Kính, in which a woman lives as a monk and her biological sex is discovered only after her death. By placing these texts in dialogue, Tran shows that cross-dressing and gender transformation were not marginal anomalies but recurring cultural motifs. Such narratives raised persistent questions about the constitution, limits, and instability of gender in Vietnamese thought.
Rather than reading these stories solely as affirmations of binary gender norms, Tran demonstrates that they can also be understood as explorations of erotic ambiguity and gendered embodiment. The distinction between desire for the monk and desire for the woman, for example, remains unsettled, suggesting that same-sex desire was thinkable—even if not named in modern identity terms. Across short stories, poetry, reform theater, urban reportage, and journalistic accounts, hundreds of narratives involving gender-crossing and sexual variance circulated in the public sphere of interwar Vietnam.
The broader historical context is crucial. The late nineteenth and early twentieth centuries were marked by colonial transformation, the rise of capitalism, new state systems, and the spread of modern science. In many parts of Asia, these forces, coupled with Western sexology, reorganized sexual norms and delegitimized previously tolerated practices. Yet in Vietnam, Sino-Vietnamese and Southeast Asian traditions that allowed greater flexibility in gender and sexual expression persisted well into the twentieth century. While European discourses on sex and gender entered Vietnamese culture, sexology’s pathologizing framework had not yet become the dominant interpretive lens.
Tran thus argues that a far more capacious vision of gendered personhood existed in this period than scholars have previously assumed. Importantly, the book also clarifies its use of the term “queer.” Rather than simply mapping contemporary LGBT identities onto the past, Queer Vietnam employs “queer” to describe historical subjects who departed from the ideological fiction that aligns biological sex, gender expression, desire, and social role into neat binary oppositions. By examining moments when these alignments fractured, the book rethinks both the periodization of Vietnamese modernity and the conceptualization of sexual modernity more broadly.
In bringing queer figures from the margins of historiography to the center of Vietnam’s cultural archive, Queer Vietnam opens new directions for understanding colonial modernity, literary innovation, and the dynamic plasticity of gender in Vietnamese history.
Read more about the book on Stanford University Press's website.

As Artificial Intelligence (AI) is increasingly discussed in the archival field, the question is no longer “Should we use AI?” but rather “How should we prepare archival data so that AI does not undermine the foundational principles of archival practice?”
In February 2026, Prof. Giovanni Colavizza and Prof. Lise Jaillant published the AI Preparedness Guidelines for Archivists, issued by the Archives & Records Association (UK & Ireland) under a CC BY license. In this document, the authors emphasize a central point: AI is only truly useful when archival collections are carefully prepared in terms of data, metadata, structure, and evaluation mechanisms.
Below are the key elements of the guidelines that may be of particular interest to the archival community in Vietnam.
The guidelines distinguish between two main types of AI models commonly applied in archives:
1. Task-Specific AI
These models are trained to perform clearly defined tasks, such as:
2. Generative AI
These models generate language and can:
An important approach highlighted in the guidelines is RAG (Retrieval-Augmented Generation). In a RAG setup, the AI system first retrieves relevant material from a well-prepared collection, and only then generates content based on the retrieved data. This approach helps reduce “hallucinations” (AI generating information not present in the source material) and improves accuracy.
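To make the retrieve-then-generate order concrete, here is a minimal Python sketch. It is an illustration only, not the method described in the guidelines: the collection, its identifiers, and the function names (`retrieve`, `build_prompt`) are hypothetical, and simple word-overlap cosine similarity stands in for whatever search technology an institution actually uses. The generation step is reduced to assembling a prompt that restricts the model to the retrieved sources.

```python
import math
import re
from collections import Counter

# Hypothetical mini-collection; identifiers and texts are invented for illustration.
COLLECTION = {
    "doc-001": "Letter from a village teacher describing the 1930 harvest festival.",
    "doc-002": "Inventory of woodblock prints held in the provincial library.",
    "doc-003": "Curatorial note on harvest rituals recorded in village registers.",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, collection, k=2):
    """Step 1: return the ids of the k records most similar to the query."""
    q = Counter(tokenize(query))
    scored = [
        (cosine(q, Counter(tokenize(text))), doc_id)
        for doc_id, text in collection.items()
    ]
    # Drop records with no overlap, then sort by score (ties broken by id).
    scored = [(score, doc_id) for score, doc_id in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]


def build_prompt(query, collection, k=2):
    """Step 2: build a generation prompt limited to the retrieved records."""
    ids = retrieve(query, collection, k)
    context = "\n".join(f"[{doc_id}] {collection[doc_id]}" for doc_id in ids)
    return (
        "Answer using only the sources below and cite their identifiers.\n"
        f"{context}\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    print(build_prompt("harvest festival in the village", COLLECTION))
```

Because the prompt carries only retrieved records and their identifiers, any generated answer can be traced back to specific items in the collection, which is the property that helps limit hallucinations.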
1. Completeness and Excluded Data
It is not necessary to digitize 100% of a collection in order to apply AI. However, it is essential to:
This is especially important for Generative AI, since AI can only reflect what is present in the data.
2. Metadata and Access Conditions
AI cannot function effectively if metadata is incomplete or fragmented.
It is necessary to ensure:
The guidelines particularly emphasize the value of narrative metadata, such as curatorial notes, historical context, and critical analysis. These elements help AI systems better understand cultural depth, power dynamics, and layered meanings within archival materials.
3. Data Formats and File Structure
Preparing data for AI does not mean “cleaning” it in ways that disrupt the original archival structure.
Instead, it is important to:
This is particularly relevant for systems using IIIF, OCR pipelines, or databases integrated with vector search technologies.
4. Application-Specific Evaluation
Each AI application requires its own set of evaluation metrics, rather than relying on generic criteria. For example:
Defining evaluation methods from the outset helps ensure that AI delivers practical value rather than functioning merely as a technological experiment.
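As a rough illustration of what application-specific evaluation can look like, the Python sketch below pairs two different applications with two different metrics: character error rate for a transcription task such as OCR, and precision at k for a retrieval task. These pairings and the function names are assumptions chosen for illustration, not metrics prescribed by the guidelines.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """Transcription metric: edit distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


def precision_at_k(retrieved_ids, relevant_ids, k):
    """Retrieval metric: share of the top-k results judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = retrieved_ids[:k]
    return sum(1 for doc_id in top if doc_id in relevant_ids) / k


if __name__ == "__main__":
    # A transcription compared against a hand-checked reference line.
    print(character_error_rate("abcd", "abxd"))
    # A search result list compared against items an archivist marked relevant.
    print(precision_at_k(["a", "b", "c"], {"a", "c"}, 2))
```

A low character error rate says nothing about whether a search interface surfaces the right records, and the reverse is equally true, which is why each application needs its own measure.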
Before launching an AI project, you should be able to answer “yes” to most of the following questions:
The most important step is not deploying another tool, but investing in AI data preparedness.
Preparing data for AI is essentially an extension of the core principles of archival practice: thorough documentation, preservation of context, structural transparency, and professional accountability. When this foundation is strong, AI can become a supportive tool rather than a force that compromises the value and integrity of archival collections.

In her article “The Cost of Open by Default in the AI Era: Can We Protect Donor Materials from Generative AI?” (30 January 2026), Rosalyn Metz, Chief Technology Officer for Libraries and Museums at Emory University, raises a foundational question for archives and cultural heritage institutions:
When generative AI can collect, synthesize, and commercialize data at unprecedented scale, is the “open by default” model still sustainable?
Metz’s article is not merely a reflection on technology; it is a warning about a fundamental shift in the digital knowledge ecosystem.
According to Metz, archival institutions are currently facing four major pressures:
1. Harvesting Knowledge Frameworks
AI companies are not only collecting content; they are also extracting the classification systems, annotations, and knowledge structures built over decades. As Metz writes, they are not merely “scraping content” but exploiting, often for profit, “the frameworks we have built to describe, relate, and explain the content.”
2. The “Infrastructure Tax”
Data-harvesting bots can overload systems and disrupt access for legitimate users. Metz describes this as a form of attack, where bots consume so many system resources that human users are slowed down or locked out entirely—a “denial-of-service attack against our human users.”
Libraries are effectively paying an “infrastructure tax” to maintain open access in the face of such activity.
3. Harvesting Physical Collections
The revival of large-scale digitization initiatives such as Google Books raises another concern: companies do not use data just once. They return whenever they build new models. Meanwhile, libraries typically receive only a one-time payment.
4. Erosion of Trust
Perhaps Metz’s most urgent concern is the erosion of trust between donors and archival institutions.
When donors give their complete works or collections, they expect protection. Yet today, as Metz acknowledges, institutions cannot provide absolute guarantees that materials will not be scraped, ingested, and commercialized by AI systems.
Metz analyzes new contractual clauses requiring institutions to prevent:
While contracts may permit internal AI uses (such as OCR or metadata generation), the larger question remains: How can these terms be enforced when the internet remains an “open buffet” of content? Metz argues that strict compliance leaves only two options:
Both solutions run counter to the open-access ethos heritage institutions have championed for decades.
Metz points to emerging standards such as:
These aim to create machine-enforceable mechanisms to restrict or monetize bot access. However, until major legal cases—such as The New York Times Company v. Microsoft Corporation—reach final rulings, the legal boundaries governing AI training on public content remain unsettled.
For Digitizing Việt Nam, the questions Metz raises are especially urgent. The project is building digital infrastructure to expand access to materials related to Vietnam. Yet alongside open access comes an ethical responsibility to donors, authors, and communities.
“Open” cannot mean “implicitly available for unlimited extraction.”
Rosalyn Metz’s article reminds us that:
In the AI era, the question is no longer whether we should open access, but: How do we open responsibly while still protecting knowledge and the people who entrusted it to us?