I’m building a configuration portal for an AI platform/service that different teams at my organization will use. One of the key services involves OCR configuration, allowing users to:
- Classify a document
- Run an extraction
- Send the extracted JSON output back to the consumer
We are using Azure AI Document Intelligence (DI) as the underlying framework for this.
For custom extraction models, the simplest approach is to redirect users to Azure Document Intelligence Studio to build their models. Then, in our portal, we could use the control plane APIs to let users select their classification and extraction models as needed. That’s likely our first version.
However, I’m exploring how we could integrate the extraction labeling and model training directly into our portal, so users wouldn’t need to visit the Azure portal at all.
What I've Found So Far:
- Azure's Labeling Tool (Docs) seems to be deprecated, so it's not a viable option.
- REST APIs & SDKs (e.g., `DocumentIntelligenceAdministrationClient`) allow programmatic model training, but they don't include a UI for labeling data.
- I also reviewed the related questions Stack Overflow suggested. Since this is an internal tool, I would probably just give the people doing the labeling access to our resource and have them use Azure's portal directly. Building the labeling UI ourselves would become more compelling if we ever wanted to expose this to external users. Also, portal access still isn't really what I'm looking for, which is a UI that produces a tag format compatible with the DI training APIs.
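For reference, the programmatic training piece looks manageable: the REST client exposes a `/documentModels:build` operation that takes a model ID, a build mode, and a blob container holding the labeled files. Below is a minimal sketch of just the request body (the model ID and container URL are placeholders, and the exact shape should be double-checked against the current API version):

```typescript
// Sketch of the request body for Document Intelligence's
// "/documentModels:build" operation. The blob container is expected to
// hold the training documents plus their label files (.pdf,
// <name>.pdf.labels.json, fields.json). Names here are placeholders.
interface AzureBlobSource {
  containerUrl: string;
  prefix?: string;
}

interface BuildDocumentModelRequest {
  modelId: string;
  buildMode: "template" | "neural";
  azureBlobSource: AzureBlobSource;
}

function buildRequest(containerUrl: string): BuildDocumentModelRequest {
  return {
    modelId: "my-extraction-model", // hypothetical model ID
    buildMode: "template",          // or "neural" for varied layouts
    azureBlobSource: { containerUrl },
  };
}

// With @azure-rest/ai-document-intelligence this would be posted as:
// client.path("/documentModels:build").post({ body: buildRequest(url) });
```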
The Core Question:
Are there any React/TypeScript libraries or patterns for implementing a document labeling UI similar to Azure DI Studio?
Specifically, I’m looking for ways to:
- Display PDFs/images for annotation
- Allow users to draw bounding boxes around key-value pairs
- Store annotations in a format compatible with Azure Document Intelligence’s Train Model API
Has anyone successfully built a similar workflow? What tools or frameworks did you use?
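For context, the on-disk format I'd need the UI to produce appears to be a per-document label file alongside a shared `fields.json`, based on what DI Studio writes to the blob container. A rough sketch of the shape (the field name, text, and coordinates below are made up, and the schema URL should be verified against what Studio currently emits):

```typescript
// Sketch of a per-document label file ("invoice1.pdf.labels.json") as
// consumed by the DI Train Model API. Each bounding box is an 8-number
// polygon (4 corner points) normalized to page dimensions. All values
// here are hypothetical.
const labelFile = {
  $schema:
    "https://schema.cognitiveservices.azure.com/formrecognizer/2021-03-01/labels.json",
  document: "invoice1.pdf",
  labels: [
    {
      label: "VendorName", // hypothetical field name
      value: [
        {
          page: 1,
          text: "Contoso Ltd.",
          boundingBoxes: [[0.1, 0.1, 0.4, 0.1, 0.4, 0.15, 0.1, 0.15]],
        },
      ],
    },
  ],
};
```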
asked Feb 5 at 18:12 by Jordan Dantas

Comments:
- Refer to this doc: Azure AI Document Intelligence client library for JavaScript – Sampath, Feb 6 at 3:35
- That looks like it might be a helpful addition to what I'm looking for. I need a component/pattern for actually annotating documents (visually) and then sending the JSON output of that to DI for training. The doc you mentioned would be necessary for bundling the annotations into a valid JSON format, but I would still need an interface for the user to do the annotations – Jordan Dantas, Feb 6 at 22:05
1 Answer
Create a user interface for document annotation that integrates with Azure Document Intelligence using a React app. You can use `react-pdf-highlighter` or `react-pdf` to load and display the document, and a library such as `react-annotation` or `react-draw` to let users draw bounding boxes and other annotations.
Refer to this Microsoft documentation for integrating Azure Document Intelligence with JavaScript.
Below is a sample code snippet that displays a PDF and lets users draw bounding boxes for fields using `react-pdf-highlighter`:
import React, { useState, useEffect, useCallback, useRef } from "react";
import {
AreaHighlight,
Highlight,
PdfHighlighter,
PdfLoader,
Popup,
Tip,
} from "react-pdf-highlighter";
import type {
Content,
IHighlight,
NewHighlight,
ScaledPosition,
} from "react-pdf-highlighter";
import { Sidebar } from "./Sidebar";
import { Spinner } from "./Spinner";
import { testHighlights as _testHighlights } from "./test-highlights";
import "./style/App.css";
import "react-pdf-highlighter/dist/style.css"; // library stylesheet (path adjusted from the example repo's relative import)
import { DocumentIntelligence } from "@azure-rest/ai-document-intelligence";
import { getLongRunningPoller, isUnexpected } from "@azure-rest/ai-document-intelligence";
import { AzureKeyCredential } from "@azure/core-auth";
const testHighlights: Record<string, Array<IHighlight>> = _testHighlights;
const getNextId = () => String(Math.random()).slice(2);
const parseIdFromHash = () => document.location.hash.slice("#highlight-".length);
const resetHash = () => { document.location.hash = ""; };
const PRIMARY_PDF_URL = "https://arxiv.org/pdf/1708.08021";
const SECONDARY_PDF_URL = "https://arxiv.org/pdf/1604.02480";
const key = "<your-key>";
const endpoint = "<your-endpoint>";
export default function App() {
const searchParams = new URLSearchParams(document.location.search);
const initialUrl = searchParams.get("url") || PRIMARY_PDF_URL;
const [url, setUrl] = useState(initialUrl);
const [highlights, setHighlights] = useState<Array<IHighlight>>(
testHighlights[initialUrl] ? [...testHighlights[initialUrl]] : []
);
const scrollViewerTo = useRef((highlight: IHighlight) => {});
const resetHighlights = () => setHighlights([]);
const toggleDocument = () => {
const newUrl = url === PRIMARY_PDF_URL ? SECONDARY_PDF_URL : PRIMARY_PDF_URL;
setUrl(newUrl);
setHighlights(testHighlights[newUrl] ? [...testHighlights[newUrl]] : []);
};
const scrollToHighlightFromHash = useCallback(() => {
const highlight = highlights.find((h) => h.id === parseIdFromHash());
if (highlight) scrollViewerTo.current(highlight);
}, [highlights]);
useEffect(() => {
window.addEventListener("hashchange", scrollToHighlightFromHash, false);
return () => window.removeEventListener("hashchange", scrollToHighlightFromHash, false);
}, [scrollToHighlightFromHash]);
const addHighlight = (highlight: NewHighlight) => {
setHighlights((prev) => [{ ...highlight, id: getNextId() }, ...prev]);
};
const updateHighlight = (highlightId: string, position: Partial<ScaledPosition>, content: Partial<Content>) => {
setHighlights((prev) =>
prev.map((h) =>
h.id === highlightId ? { ...h, position: { ...h.position, ...position }, content: { ...h.content, ...content } } : h
)
);
};
// Example: send a document URL to DI for layout analysis (wire this to a button or upload handler)
const analyzeDocument = async (docUrl: string) => {
const client = DocumentIntelligence(endpoint, new AzureKeyCredential(key));
const initialResponse = await client
.path("/documentModels/{modelId}:analyze", "prebuilt-layout")
.post({ contentType: "application/json", body: { urlSource: docUrl } });
if (isUnexpected(initialResponse)) throw initialResponse.body.error;
const poller = await getLongRunningPoller(client, initialResponse);
const analyzeResult = (await poller.pollUntilDone()).body.analyzeResult;
console.log("Extracted document:", analyzeResult);
};
return (
<div className="App" style={{ display: "flex", height: "100vh" }}>
<Sidebar highlights={highlights} resetHighlights={resetHighlights} toggleDocument={toggleDocument} />
<div style={{ height: "100vh", width: "75vw", position: "relative" }}>
<PdfLoader url={url} beforeLoad={<Spinner />}>
{(pdfDocument) => (
<PdfHighlighter
pdfDocument={pdfDocument}
enableAreaSelection={(event) => event.altKey}
onScrollChange={resetHash}
scrollRef={(scrollTo) => {
scrollViewerTo.current = scrollTo;
scrollToHighlightFromHash();
}}
onSelectionFinished={(position, content, hideTipAndSelection, transformSelection) => (
<Tip
onOpen={transformSelection}
onConfirm={(comment) => {
addHighlight({ content, position, comment });
hideTipAndSelection();
}}
/>
)}
highlightTransform={(highlight, index, setTip, hideTip, viewportToScaled, screenshot, isScrolledTo) => (
<Popup
popupContent={<div className="Highlight__popup">{highlight.comment?.emoji} {highlight.comment?.text}</div>}
onMouseOver={(popupContent) => setTip(highlight, () => popupContent)}
onMouseOut={hideTip}
key={index}
>
{highlight.content?.image ? (
<AreaHighlight
isScrolledTo={isScrolledTo}
highlight={highlight}
onChange={(boundingRect) =>
updateHighlight(highlight.id, { boundingRect: viewportToScaled(boundingRect) }, { image: screenshot(boundingRect) })
}
/>
) : (
<Highlight isScrolledTo={isScrolledTo} position={highlight.position} comment={highlight.comment} />
)}
</Popup>
)}
highlights={highlights}
/>
)}
</PdfLoader>
</div>
</div>
);
}
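To close the loop between the highlighter and DI training, each highlight's rectangle can be mapped into the label-value shape that a labels.json file expects. The sketch below uses simplified stand-in types (the real `ScaledPosition` from `react-pdf-highlighter` carries more fields) and assumes the rect coordinates and page size share the same units, so the polygon can be normalized to page dimensions:

```typescript
// Sketch: convert a highlight rect (absolute PDF-space coordinates plus
// page size) into the normalized 8-number polygon used by DI labels.json
// bounding boxes. Types are simplified stand-ins, not the library's own.
interface ScaledRect {
  x1: number; y1: number; x2: number; y2: number;
  width: number;  // page width, same units as x1/x2
  height: number; // page height, same units as y1/y2
  pageNumber: number;
}

function toLabelValue(text: string, rect: ScaledRect) {
  const nx = (x: number) => x / rect.width;
  const ny = (y: number) => y / rect.height;
  return {
    page: rect.pageNumber,
    text,
    // corners clockwise from top-left: TL, TR, BR, BL
    boundingBoxes: [[
      nx(rect.x1), ny(rect.y1),
      nx(rect.x2), ny(rect.y1),
      nx(rect.x2), ny(rect.y2),
      nx(rect.x1), ny(rect.y2),
    ]],
  };
}
```

Grouping the resulting values by field name then yields the `labels` array for one document's label file.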
For more details, refer to the documentation on `react-annotation` and other related packages.