Streamlit for Snowflake
Bridging the gap between data complexity and accessibility
Snowflake is a leading cloud data warehouse, known for its scalability and performance. Its web user interface (UI) caters to data professionals, offering a straightforward and uncluttered space for running queries, inspecting data, and making updates. For business users who do not have a strong command of SQL, however, the web UI can be a formidable obstacle.
Recently, we were presented with a challenge: to design an interface that would let our client fix data quality issues in their raw data without needing SQL expertise. This called for a user-friendly solution that would bridge the gap between data complexity and accessibility.
The Problem
SharkSmart is an NSW Government program that tags and tracks White, Tiger and Bull sharks along the NSW coast. Sharks are caught using SMART drumlines, which notify a team of experts who can respond within minutes. Each shark is fitted with an acoustic tag and a range of measurements is taken before the shark is released. The information collected during tagging is recorded and pushed into a Snowflake data warehouse for ongoing research.
As you can imagine, the open ocean is not a typical data entry environment, and in such a dynamic location data entry errors do occur. Researchers review the new data daily to identify errors and correct them within Snowflake. The researchers do not have SQL skills and needed a simple interface to review and correct the data.
The Solution
Streamlit within Snowflake fits this need well. What sets it apart is that the app is hosted within Snowflake itself, which simplifies deployment and management because there is no separate web server to run. The Streamlit app queries Snowflake directly, so the interface and the data live on the same platform.
One of the standout features is the streamlined security integration. By sitting on top of Snowflake, Streamlit inherits the robust security protocols in place, requiring little to no additional administrative overhead. This means that sensitive data from the shark tagging process remains highly secure and within Snowflake at all times.
What's truly remarkable is how easy it is to build a user-friendly interface with Streamlit. Streamlit's intuitive design and Snowflake's data capabilities allow rapid creation of an easy-to-use interface, so that data quality corrections can be made efficiently.
Technical Details
The data is sourced from an external system and loaded each night. The researchers review each record and make minor fixes to values, for example converting a measurement that was recorded in centimetres instead of metres, or correcting an identifier value. A researcher can also manually add an entry if one has been missed. The interface can be used to search for a shark by any of its tag values, or by the date it was caught.
Behind the scenes, a simple SELECT query is executed against the Snowflake table to return the matching set of records. The returned dataset is displayed using Streamlit's data_editor, which renders an editable table so the user can change values and save their changes.
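The search-and-edit flow described above might look something like the following minimal sketch. The table name, column names, and widget labels are assumptions made for illustration (the article does not give them), and this is not the project's actual code. The Streamlit and Snowpark imports sit inside render_page() so the query builder can be run and tested on its own.

```python
from datetime import date

# Hypothetical table and column names; the real schema is not given in the article.
TABLE = "SHARK_TAGGING"
TAG_COLUMNS = ("ACOUSTIC_TAG_ID", "EXTERNAL_TAG_ID")


def build_search_query(tag_value=None, caught_on=None):
    """Return (sql, params) for a simple filtered SELECT using bind variables."""
    clauses, params = [], []
    if tag_value:
        # Match the search text against any of the tag columns.
        clauses.append("(" + " OR ".join(f"{col} = ?" for col in TAG_COLUMNS) + ")")
        params.extend([tag_value] * len(TAG_COLUMNS))
    if caught_on:
        clauses.append("DATE_CAUGHT = ?")
        params.append(caught_on.isoformat())
    where = f" WHERE {' AND '.join(clauses)}" if clauses else ""
    return f"SELECT * FROM {TABLE}{where} ORDER BY DATE_CAUGHT DESC", params


def render_page():
    """Draw the search inputs and the editable table (runs inside Streamlit in Snowflake)."""
    import streamlit as st
    from snowflake.snowpark.context import get_active_session

    session = get_active_session()
    tag_value = st.text_input("Tag value")
    filter_by_date = st.checkbox("Filter by date caught")
    caught_on = st.date_input("Date caught") if filter_by_date else None

    sql, params = build_search_query(tag_value or None, caught_on)
    records = session.sql(sql, params=params).to_pandas()

    # num_rows="dynamic" lets the user add a missed entry as a new row.
    edited = st.data_editor(records, num_rows="dynamic")
    return records, edited
```

In an app script, render_page() would be called at the top level. Using bind variables rather than string formatting keeps the search text out of the SQL itself.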
The interface is very simple and allows researchers to quickly make minor edits to resolve data quality issues.
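One way the save step could work is to compare the rows as loaded with the rows as edited and write back only the cells that changed. The sketch below is an assumption about how this might be done, not the method used in the project; the RECORD_ID key, the SHARK_TAGGING table, and the LENGTH_M column are hypothetical names.

```python
def changed_cells(original, edited, key="RECORD_ID"):
    """Return (key_value, column, new_value) for each cell that differs."""
    before = {row[key]: row for row in original}
    changes = []
    for row in edited:
        old = before.get(row[key])
        if old is None:
            continue  # A newly added row needs an INSERT; not handled in this sketch.
        for column, new_value in row.items():
            if column != key and old.get(column) != new_value:
                changes.append((row[key], column, new_value))
    return changes


def to_update_statements(changes, table="SHARK_TAGGING", key="RECORD_ID"):
    """Turn cell changes into parameterised UPDATE statements.

    Column names come from the table's own schema, never from typed user input.
    """
    return [
        (f"UPDATE {table} SET {column} = ? WHERE {key} = ?", [new_value, key_value])
        for key_value, column, new_value in changes
    ]


# Example: a length entered in centimetres is corrected to metres.
loaded = [{"RECORD_ID": 1, "LENGTH_M": 250.0}, {"RECORD_ID": 2, "LENGTH_M": 3.1}]
after_edit = [{"RECORD_ID": 1, "LENGTH_M": 2.5}, {"RECORD_ID": 2, "LENGTH_M": 3.1}]
statements = to_update_statements(changed_cells(loaded, after_edit))
```

With pandas DataFrames, the row lists can be obtained with to_dict("records"), and each statement could then be run with session.sql(sql, params=params).collect(). Missing values (NaN) never compare equal to each other, so they would need normalising before the comparison.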
Conclusion
In the field of data management, Snowflake stands as a powerhouse of scalability and performance, offering data professionals a robust platform for their diverse needs. While the Snowflake user interface provides a user-friendly environment for seasoned data experts, the challenge emerges when we consider business users who may not have a strong grasp of SQL.
The story of SharkSmart and its mission to track sharks along the New South Wales coast exemplifies this challenge. In a dynamic and unconventional data entry setting, data quality issues are bound to arise. The team's daily task involves meticulously reviewing new data, identifying errors, and ensuring the accuracy of critical information pushed into Snowflake. With no SQL expertise at their disposal, the need for a straightforward interface was paramount.
The solution came in the form of Streamlit within Snowflake, and it revolutionised the way data quality was addressed. The hosting of Streamlit within Snowflake not only simplified deployment but also ensured that data remained secure, leveraging Snowflake's robust security protocols. Streamlit offered an intuitive design that allowed for the rapid creation of a user-friendly interface, enabling the SharkSmart team to efficiently correct data quality errors.
In essence, SharkSmart's use of Streamlit within Snowflake shows the impact of simplicity in data management. Pairing a user-friendly interface with a robust data warehouse lets businesses and organisations maintain data quality regardless of their team's technical expertise. As more work becomes data-driven, this kind of accessibility and efficiency will only matter more.