Microsoft’s experimental SpreadsheetLLM software helps AI better understand spreadsheets

Researchers at Microsoft Corp. today released details of an experimental artificial intelligence model called SpreadsheetLLM, and as the name suggests, it’s designed to work with spreadsheet applications such as Excel and Google Sheets.

The model, described in a research paper published on arXiv.org on July 12, aims to solve the challenges of applying AI to spreadsheets, which are widely used in the business world but have proven difficult for large language models to master.

According to Microsoft researchers, SpreadsheetLLM uses a new approach to encode spreadsheet content into a format that LLMs can work with more easily. That, they say, paves the way for these models to “reason about spreadsheet content.”

The researchers highlighted the critical need for improvements in this particular area of AI. Spreadsheets are used for a wide range of tasks, from simple data entry and analysis to complex financial modeling and decision-making. But existing LLMs struggle to understand and reason about the contents of spreadsheets. The difficulty stems from the highly structured nature of the data they contain and the presence of complex formulas and references.

SpreadsheetLLM gets around this problem by encoding spreadsheet data in a more LLM-friendly way, making it easier for the models to interpret.

To do this, the researchers developed a new encoding mechanism called SheetCompressor that preserves the structure and relationships of the data, while making it accessible to LLMs. In particular, SheetCompressor compresses spreadsheet data by up to 96%, allowing LLMs to handle large data sets within their token limits.
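To put the 96% figure in perspective, the short Python sketch below works through the arithmetic. The function names, the 200,000-token sheet and the 16,000-token context window are hypothetical values chosen for illustration; they do not come from the paper.

```python
def tokens_after_compression(original_tokens: int, reduction: float) -> int:
    """Return the token count left after removing `reduction` of the tokens.

    `reduction` is a fraction in [0, 1), e.g. 0.96 for a 96% reduction.
    """
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in the range [0, 1)")
    return round(original_tokens * (1 - reduction))


def compression_ratio(reduction: float) -> float:
    """Return how many times smaller the encoding becomes."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in the range [0, 1)")
    return 1 / (1 - reduction)


# Hypothetical numbers, for illustration only.
sheet_tokens = 200_000   # assumed size of a naively encoded sheet
context_limit = 16_000   # assumed context window of some model

compressed = tokens_after_compression(sheet_tokens, 0.96)
print(compressed)                    # 8000
print(compressed <= context_limit)   # True
print(round(compression_ratio(0.96)))  # 25
```

In other words, a reduction of 96% corresponds to an encoding roughly 25 times smaller, which is the difference between a sheet that overflows a model's context window and one that fits comfortably.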

The researchers also highlighted another feature, called “structural anchor extraction,” which identifies key rows and columns that define table structures. Meanwhile, “inverted index translation” is a method that efficiently encodes cell contents and addresses to minimize redundancy, while “format-aware aggregation” groups cells with similar formats together to further minimize token usage.
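The following Python sketch illustrates the general idea behind two of these techniques. It is a simplified approximation under stated assumptions, not Microsoft's implementation: the function names, the regular-expression format rules and the sample sheet are all hypothetical, and structural anchor extraction is not shown.

```python
import re
from collections import defaultdict
from typing import Dict, List


def invert_index(cells: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each distinct non-empty value to the addresses where it appears.

    Empty cells are dropped and repeated values are stored only once,
    which is where the token savings come from.
    """
    index: Dict[str, List[str]] = defaultdict(list)
    for address, value in cells.items():
        if value != "":
            index[value].append(address)
    return dict(index)


def format_label(value: str) -> str:
    """Assign a coarse format label to a cell value (illustrative rules)."""
    if re.fullmatch(r"-?\d+", value):
        return "Int"
    if re.fullmatch(r"-?\d+\.\d+", value):
        return "Float"
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return "Date"
    return "Text"


def aggregate_by_format(cells: Dict[str, str]) -> Dict[str, List[str]]:
    """Group non-empty cell addresses by format label instead of exact value."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for address, value in cells.items():
        if value != "":
            groups[format_label(value)].append(address)
    return dict(groups)


# A tiny hypothetical sheet: address -> displayed value.
cells = {
    "A1": "Region", "B1": "Sales",
    "A2": "East",   "B2": "1200",
    "A3": "East",   "B3": "950",
    "A4": "",       "B4": "",
}

print(invert_index(cells))
# {'Region': ['A1'], 'Sales': ['B1'], 'East': ['A2', 'A3'], '1200': ['B2'], '950': ['B3']}
print(aggregate_by_format(cells))
# {'Text': ['A1', 'B1', 'A2', 'A3'], 'Int': ['B2', 'B3']}
```

In this sketch the repeated value "East" is written once with two addresses, the empty row disappears entirely, and the two sales figures collapse into a single "Int" group, so the model sees the shape of the table without spending tokens on every individual number.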

In their experiments, the researchers found that SpreadsheetLLM achieved impressive results in a spreadsheet table detection task, outperforming existing methods by 12.3%. It also performed well in spreadsheet question-answering tasks.

SpreadsheetLLM was applied to a range of well-known LLMs, including GPT-3.5, GPT-4, and Llama 2, and testing showed that it significantly improved the ability of these models to handle spreadsheet tasks. For example, GPT-4 achieved a table detection score of 78.9%.

The researchers said SpreadsheetLLM is still an experimental model and has some limitations when it comes to more complex spreadsheet formats, but they also think it has a lot of potential. For example, they say it could be applied to tasks like automating routine data analysis to generate insights and recommendations based on the contents of spreadsheets. By helping LLMs understand spreadsheets, answer questions about them, and even create new ones based on natural language prompts, it opens the door to new possibilities in AI-assisted data analysis and decision-making.

SpreadsheetLLM could also help make spreadsheets more accessible to human workers, many of whom struggle to master the more complex features of tools like Excel. One of the challenges of working with spreadsheets is the need to learn complex formulas to manipulate the data they contain. But SpreadsheetLLM could help users manipulate that data using natural language commands.

Finally, the researchers say, SpreadsheetLLM could help automate some of the more tedious tasks associated with spreadsheets, such as cleaning, formatting and aggregating data.

The study is important because so much of the world’s business relies on Excel spreadsheets, said Holger Mueller, an analyst at Constellation Research Inc. “It’s critical for Microsoft to be at the forefront of this effort to make Excel spreadsheets more accessible through AI,” he said. “Verbal access to spreadsheets provides tremendous value, both for creating and analyzing Excel files.”

Mueller also said AI has the potential to democratize the use of spreadsheets by making them easy for everyone to use. “If Microsoft gets this right, it will not only secure the future of Excel, but it will also change the future of work as we know it,” he predicted.

For now, SpreadsheetLLM is just a research project, and Microsoft has not yet indicated whether it plans to turn it into a real product. But it’s not hard to imagine that some kind of “Copilot for Excel” could emerge from this research.
