Tags · SEG / Tools / mt-varextract / mt-varextract-tool-webclient

v1.1.4a

Release Notes on v1.1.4a (this tag is the actual version 1.1.4):

Changes:
- A selection can now be added by pressing the Enter key, not only by clicking the "Add Selection" button on the example marker view.
- The time indicator on the processing dialog now counts down.
- Pages that were not processed because they contain too few characters are now also watermarked as such in the last-run results view.

Fixes:
- Some minor visual bugs have been fixed.
- A bug where processing hung because of a page in an invalid state has been fixed.
- Adding a selection on the example marker view used to accept empty selections and selections containing only whitespace. A selection could also contain leading and trailing whitespace as well as double spaces between words. All of this has been fixed: only non-empty, trimmed selections are added.
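
The selection fix can be pictured as a small normalization step that runs before a selection is stored. The following is only a sketch under that assumption: `normalizeSelection` and `addSelection` are hypothetical names, not taken from the tool's code.

```typescript
// Hypothetical helper, for illustration only.
// Returns the cleaned selection, or null if nothing usable remains.
function normalizeSelection(raw: string): string | null {
  // Collapse every run of whitespace (spaces, tabs, line breaks) into a
  // single space, then strip leading and trailing whitespace.
  const cleaned = raw.replace(/\s+/g, " ").trim();
  return cleaned.length > 0 ? cleaned : null;
}

// Returns a new list that includes the selection only if it is non-empty
// after normalization; the input list is left untouched.
function addSelection(selections: string[], raw: string): string[] {
  const cleaned = normalizeSelection(raw);
  return cleaned === null ? selections : [...selections, cleaned];
}
```

With this shape, both the Enter key handler and the "Add Selection" button could call the same function, so the two input paths cannot drift apart.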

v1.1.4

34c1d81e · Merge branch 'wip/v1.1.3' into 'main' · Apr 13, 2023

Release Notes on v1.1.4:

Changes:
- A selection can now be added by pressing the Enter key, not only by clicking the "Add Selection" button on the example marker view.
- The time indicator on the processing dialog now counts down.
- Pages that were not processed because they contain too few characters are now also watermarked as such in the last-run results view.

Fixes:
- Some minor visual bugs have been fixed.
- A bug where processing hung because of a page in an invalid state has been fixed.
- Adding a selection on the example marker view used to accept empty selections and selections containing only whitespace. A selection could also contain leading and trailing whitespace as well as double spaces between words. All of this has been fixed: only non-empty, trimmed selections are added.

v1.1.3

34c1d81e · Merge branch 'wip/v1.1.3' into 'main' · Apr 13, 2023

Release Notes on v1.1.3:

What has changed?
- If pages fail during processing because the prompt is too large, the user is informed at the end of the processing run and advised to retry those pages with fewer examples specified.
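
As a rough illustration of this behaviour, the failures can be collected per page and summarized once the run is over. The result shape, the `"prompt-too-large"` reason string, and the function name below are assumptions made for this sketch; the notes do not show the tool's actual types or message text.

```typescript
// Hypothetical per-page result; the real tool's types are not shown in these notes.
type PageResult =
  | { page: number; ok: true }
  | { page: number; ok: false; reason: "prompt-too-large" | "other" };

// Builds the end-of-run notice, or returns null when no page failed
// because of an oversized prompt.
function promptTooLargeNotice(results: PageResult[]): string | null {
  const failedPages = results
    .filter((r) => !r.ok && r.reason === "prompt-too-large")
    .map((r) => r.page);
  if (failedPages.length === 0) {
    return null;
  }
  return (
    `${failedPages.length} page(s) could not be processed because the prompt ` +
    `was too large (pages ${failedPages.join(", ")}). ` +
    `Retry them with fewer example pages.`
  );
}
```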

v1.1.2

f7566cda · Merge branch 'wip/v1.1.2' into 'main' · Apr 13, 2023

Release notes on v1.1.2:

What has changed:
- The code responsible for the API calls has been refactored. No functionality has been added.

v1.1.1

26141eb1 · Merge branch 'wip/v1.1.1' into 'main' · Apr 11, 2023

Release notes v1.1.1:

What has changed:
- Preprocessing has been improved: the large screenshot string is now removed after preprocessing, leaving only a small thumbnail string.
- After uploading a file, the UI now waits until all pages have been fully screenshotted.
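
The memory saving from dropping the screenshot can be sketched as follows. The `PageData` fields and the `finishPreprocessing` function are hypothetical and only illustrate the idea of keeping the thumbnail while discarding the large image string once OCR is done.

```typescript
// Hypothetical page record; the field names are assumptions.
interface PageData {
  pageNumber: number;
  screenshot?: string; // large image string, only needed until OCR is done
  thumbnail: string; // small image string kept for the UI
  text?: string; // OCR result
}

// Returns a copy of the page that carries the OCR text but no longer
// holds the large screenshot string. The input object is not modified.
function finishPreprocessing(page: PageData, ocrText: string): PageData {
  const result: PageData = { ...page, text: ocrText };
  delete result.screenshot;
  return result;
}
```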

v1.1.0

0858fbe5 · Merge branch 'feat/v1.1.0-extensions-and-enhancements' into 'main' · Apr 05, 2023

Release notes on v1.1.0:

Changes:
- A new toolbar button allows "clearing the desk", i.e. resetting the state of the tool and returning to the upload view.
- The consolidated results view, where the variables can be downloaded, now offers an option to include page information in the downloaded file, e.g. the text extracted from the page and the bare variables extracted.
- Visual feedback on the upload view has been enhanced: after a file has been uploaded and successfully validated, the UI now indicates that the file has been received. This increases confidence that the tool is actually doing something after an upload.

v1.0.0

d01cf6b5 · Merge branch 'feat/v0.8.0-results-view' into 'main' · Mar 31, 2023

Release notes v1.0.0:

What is new?
The workflow is now complete: after inference has been performed on a given set of pages, based on another given set of example pages where the user has manually extracted the variables, the results are displayed in two new views. The first view shows the rendered pages with the variables extracted in the last processing run highlighted. The second view shows all consolidated extracted variables in lemmatized form. From there, the user can also download the variables as a JSON, plain text, or CSV file.
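
To illustrate the three download formats, here is a minimal sketch of how a list of consolidated variables could be serialized. The `ExtractedVariable` shape, the column names, and the use of `;` to separate page numbers are assumptions for this example and do not describe the tool's actual file layout.

```typescript
// Hypothetical shape of a consolidated variable.
interface ExtractedVariable {
  name: string;
  pages: number[];
}

// JSON export: pretty-printed array of variables.
function toJson(vars: ExtractedVariable[]): string {
  return JSON.stringify(vars, null, 2);
}

// Plain text export: one variable name per line.
function toText(vars: ExtractedVariable[]): string {
  return vars.map((v) => v.name).join("\n");
}

// Quotes a CSV field when it contains a quote, comma, or line break,
// doubling any embedded quotes.
function csvField(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// CSV export: a header row followed by one row per variable.
function toCsv(vars: ExtractedVariable[]): string {
  const rows = vars.map(
    (v) => `${csvField(v.name)},${csvField(v.pages.join(";"))}`
  );
  return ["variable,pages", ...rows].join("\n");
}
```

Quoting only the fields that need it keeps the CSV readable while still handling variable names that contain commas or quotes.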

What has changed?
This version also brings a big refactoring: a page is now considered a central element of the tool and is designed as a class based on a state machine with defined states and state transitions. This led to a major refactoring of how background preprocessing and processing work, as well as of how pages are selected as examples or for processing. The benefits are increased resilience of the individual steps, simpler background processing algorithms, and clearly defined page behaviour.
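
A page class built on a state machine could look roughly like the sketch below. The state names and the allowed transitions are invented for illustration; the notes do not list the states the tool actually uses.

```typescript
// Hypothetical states; the real tool's states are not named in these notes.
type PageState =
  | "uploaded"
  | "preprocessing"
  | "preprocessed"
  | "processing"
  | "processed"
  | "failed";

// Allowed transitions: each state maps to the states it may move to.
const TRANSITIONS: Record<PageState, readonly PageState[]> = {
  uploaded: ["preprocessing"],
  preprocessing: ["preprocessed", "failed"],
  preprocessed: ["processing"],
  processing: ["processed", "failed"],
  processed: ["processing"], // assumed: a page may be processed again
  failed: ["preprocessing", "processing"], // assumed: failed steps can be retried
};

class Page {
  readonly pageNumber: number;
  private state: PageState = "uploaded";

  constructor(pageNumber: number) {
    this.pageNumber = pageNumber;
  }

  get currentState(): PageState {
    return this.state;
  }

  canTransitionTo(next: PageState): boolean {
    return TRANSITIONS[this.state].indexOf(next) !== -1;
  }

  // Moves to the next state, or throws if the transition is not allowed.
  transitionTo(next: PageState): void {
    if (!this.canTransitionTo(next)) {
      throw new Error(
        `Invalid transition for page ${this.pageNumber}: ${this.state} -> ${next}`
      );
    }
    this.state = next;
  }
}
```

Because every state change goes through `transitionTo`, a page cannot silently end up in an undefined state, which is the kind of resilience the refactoring aims for.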
The overview view has been refactored drastically: instead of the whole pdf, only a single page is rendered now. It is possible to jump to a specific page, to select a page from a page preview, and to "scroll" through the pages as in a usual pdf viewer.
There have also been some performance improvements: preprocessing is now much faster because the page screenshots are smaller, which increases the risk of faulty OCR but saves a lot of time. This trade-off can be improved further in the future. The memory footprint has also been reduced by no longer caching any components.
Finally, the ExperimentalView has been removed completely and the Home View now displays some welcoming information. The color palette has also changed: the tool now comes in a bluish tone, which makes the major upgrade visible as well.

v0.7.0

c57859d2 · Merge branch 'feat/v0.7.0-add-api-inference' into 'main' · Mar 24, 2023

Release Notes on v0.7.0

This version brings the "processing" step of the workflow: pages can now be selected for processing, i.e. they are sent to the backend API, which infers the variables from the page text based on the specified example pages.
The processing view allows pages to be selected, and a processing dialog reports the progress, e.g. how many pages have been processed and approximately how long it will take until completion.
Besides these new features, the views where pages are rendered as small thumbnail images have been refactored to improve performance and reduce resource consumption.
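
One simple way to produce the remaining-time estimate shown in the processing dialog is to extrapolate from the average time per page processed so far. This is only a sketch of that idea; the function names are hypothetical and the tool may estimate the time differently.

```typescript
// Estimates the remaining time in milliseconds from the average time per
// processed page. Returns null while no page has been processed yet.
function estimateRemainingMs(
  processed: number,
  total: number,
  elapsedMs: number
): number | null {
  if (processed <= 0 || total <= 0) {
    return null;
  }
  const remainingPages = Math.max(total - processed, 0);
  return (elapsedMs / processed) * remainingPages;
}

// Formats a duration as m:ss for display, rounding up to whole seconds.
function formatCountdown(ms: number): string {
  const totalSeconds = Math.ceil(Math.max(ms, 0) / 1000);
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return `${minutes}:${seconds < 10 ? "0" : ""}${seconds}`;
}
```

For example, after 2 of 10 pages have taken 4 seconds in total, the estimate is 16 seconds, which `formatCountdown` renders as `0:16`.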

v0.6.0

cff25c9a · Merge branch 'feat/v0.6.0-variable-marking-for-examples' into 'main' · Mar 17, 2023

Release Notes on v0.6.0

This version completes the example definition step of the variable extraction workflow. It is now possible to select the pages that should serve the model as examples, and to highlight and extract all the variables on those example pages. These example pages form the preamble of the model prompt, so an important milestone has been reached with this version.

Variables can be extracted with standard text selection or by typing into a text box, so an extraction made by highlighting text can always be corrected afterwards. All extracted variables are also highlighted visually on the page, making it easier to recognise which terms have already been selected.

Besides completing example selection and marking, this version also brings small performance gains: the render-intensive example selection and overview views are now cached and served from the cache whenever navigation leads back to them. Still, client performance remains an issue, especially for larger pdf files. This is considered a point for improvement in future versions.

v0.5.0

fccc3985 · Merge branch 'feat/v0.5.0-examples-first-part' into 'main' · Mar 11, 2023

Release Notes on v0.5.0

This version brings the first part of the example selection and example variable marking view. It adds a new view after the overview view, which can be accessed with the "Examples" button. The new view displays all pages as thumbnails and allows a selection of them to be used later as the example pages for prompt creation.
The background data stores have also been refactored: the document store has been split into multiple stores. After extraction, the pages are passed to the preprocessing store, which performs the preprocessing, and then to the processing store, which collects them, performs the final processing, and passes the results to the results store. The results store is not yet implemented, and neither is the processing logic in the processing store; both will come with future releases.
There are also some new utility classes in the application, which simplify some parts of the app.

The next version will add support for marking variables on the selected example pages, which will complete the examples view.

v0.4.0

60d0ee64 · fix reference of wrong static folder in webserver · Mar 09, 2023

Release notes v0.4.0

v0.4.0 brings the overview view, where the uploaded pdf is displayed. This overview is the starting point for the extraction process: from there, the user will be able to switch to any of the three steps of the process. None of them is implemented yet, but this will come in the next version.
This version also improves the preprocessing of the uploaded pdf file: requests to the backend are now chained so that only a limited number of requests are open at any time. This number is currently set to 1, i.e. a request is only sent after the previous one has finished (successfully). This means that the pages are OCRed one after another. This does not scale very well for large pdfs, but has the benefit of supporting pdfs with any number of pages.
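
The request chaining described above can be approximated with a small helper that never keeps more than `limit` tasks in flight. This is a generic sketch, not the tool's implementation: `runLimited` is a hypothetical name, and `task` stands in for the actual OCR request to the backend.

```typescript
// Runs `task` for every item while keeping at most `limit` tasks in flight.
// Results are returned in the order of the input items.
async function runLimited<T, R>(
  items: readonly T[],
  limit: number,
  task: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array<R>(items.length);
  let next = 0; // index of the next item that has not been started yet

  // Each worker repeatedly takes the next unstarted item until none are left.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await task(items[index], index);
    }
  }

  const workerCount = Math.max(1, Math.min(limit, items.length));
  const workers: Promise<void>[] = [];
  for (let i = 0; i < workerCount; i++) {
    workers.push(worker());
  }
  await Promise.all(workers);
  return results;
}
```

With `limit` set to 1 there is a single worker, so each request starts only after the previous one has resolved, and a rejected request stops the chain because the worker exits with that error. Raising the limit later would allow several pages to be OCRed in parallel without changing the calling code.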

Future versions will include the yet missing steps of the process.

v0.3.0

ceaa3d1e · Merge branch 'feat/v0.3.0-upload-a-pdf' into 'main' · Mar 04, 2023

The new feature introduced with v0.3.0 is an Upload View, where a pdf file can be selected for upload to the tool. The client then takes the pdf and preprocesses it in the background, i.e. it cuts the pdf into single pages, takes a screenshot of every page, and sends each screenshot to the backend API to perform OCR (text extraction).
Attention: large files are not yet handled efficiently and take an extremely long time to preprocess.

v0.2.0

7086ff27 · Merge branch 'feat/v0.2.0-add-basic-components' into 'main' · Feb 25, 2023

v0.2.0

This version brings the first draft of the tool's look and feel. There is still not much functionality apart from some fancy-looking ping buttons, which all do the same thing, i.e. ping the backend and print a message to the console.
Still, this version brings some valuable technical groundwork, which will make future features easier to implement.

v0.1.0

5d102ee7 · implement simple ping button · Feb 20, 2023

v0.1.0

Introduces a minimal running version of the mt-varextract-tool-webclient: basically the Vue.js scaffolding project with a custom button added, which calls the configured backend API's ping and displays the response.