Electronic products are evolving at lightning speed, driven by an insatiable demand for new consumer devices, energy, transport, robotics, connectivity, data and beyond. However, the processes behind designing and manufacturing electronics have remained largely unchanged, held back by cumbersome, time-consuming and outdated practices. That’s why Wizerr, a leader in AI innovation for the electronics industry, set out to build GenAI-powered teammates for component engineering that reduce the time needed to design, engineer and procure parts by up to 80%.
Historically, product data used in electronics component engineering has been stuck in a labyrinth of unstructured datasheets, manuals, errata, and API and code documentation that requires deep domain expertise to unlock. Wizerr’s AI teammates are pre-trained on power management, RF, wireless, and embedded systems. They are adept at interpreting complex electronics specifications, recommending technically accurate components, finding alternative parts, and designing block diagrams with precision and speed, which leads to a more optimized Engineering BOM (Bill of Materials).
The Databricks Data Intelligence Platform was critical to solution development, giving Wizerr the ability to unify, scale, and operationalize data faster than ever before — and build a practical, scalable solution in a matter of weeks.
The Challenge: Scaling to a Million Datasheets
Datasheets for electronic components are dense, unstructured documents with tables, diagrams, and technical jargon. Traditional data pipelines struggle with the volume and complexity, due to several factors:
- Inconsistent Formats: Each datasheet is unique in layout, requiring adaptable parsing mechanisms.
- Rich Data Contexts: Large language models (LLMs), such as those that power tools like ChatGPT, are known to struggle when interpreting numeric values in complex tables, figures, graphs and PDFs. Moreover, extracting and interpreting specifications (such as voltage ranges or current outputs) demands accurate numeric reasoning combined with industry-specific semantic reasoning.
- Scaling Requirements: Processing a million datasheets in bulk and supporting real-time operations with high throughput and low latency, while maintaining data integrity and accuracy.
- Model Iteration: Training, experimenting with, and refining models to extract complex information from datasheets and optimize GenAI models for accurate, context-aware query responses.
Where traditional data pipelines struggled with the volume and complexity of such tasks, Databricks’ robust ecosystem substantially improved Wizerr’s ELX AI engine and workflows.
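To make the numeric-reasoning challenge concrete, here is a minimal sketch of turning a free-text range such as "2.7 V to 5.5 V" into normalized numbers. It is not Wizerr's extraction method, which the post does not describe. The names `SpecRange` and `parse_range` are hypothetical, and the sketch assumes only volt and ampere units with a handful of SI prefixes.

```python
import re
from typing import NamedTuple, Optional

# Multipliers for the SI prefixes this sketch understands (None = no prefix).
_SI_PREFIX = {None: 1.0, "m": 1e-3, "u": 1e-6, "µ": 1e-6, "k": 1e3}

# A number, optionally followed by a unit such as "V", "mA" or "kV".
_VALUE = r"([-+]?\d+(?:\.\d+)?)\s*(?:([muµk])?([VA]))?"
_RANGE = re.compile(_VALUE + r"\s*to\s*" + _VALUE)


class SpecRange(NamedTuple):
    minimum: float  # expressed in the base unit
    maximum: float  # expressed in the base unit
    unit: str       # "V" or "A"


def parse_range(text: str) -> Optional[SpecRange]:
    """Return the first '<low> to <high>' range found in text, or None."""
    match = _RANGE.search(text)
    if match is None:
        return None
    low, low_prefix, low_unit, high, high_prefix, high_unit = match.groups()
    if low_unit is None and high_unit is None:
        return None  # numbers without any unit are too ambiguous to trust
    # "2.7 to 5.5 V": a bare value inherits the unit of the other bound.
    if low_unit is None:
        low_prefix, low_unit = high_prefix, high_unit
    if high_unit is None:
        high_prefix, high_unit = low_prefix, low_unit
    if low_unit != high_unit:
        return None  # mixing volts and amperes signals a bad extraction
    minimum = float(low) * _SI_PREFIX[low_prefix]
    maximum = float(high) * _SI_PREFIX[high_prefix]
    if minimum > maximum:
        return None  # reject inverted ranges instead of guessing
    return SpecRange(minimum, maximum, low_unit)
```

Returning `None` for anything ambiguous, rather than a best guess, keeps doubtful values out of downstream tables so they can be routed to review. Real datasheets need far more than a regular expression, which is the point the list above makes.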
How Databricks Simplified Complex Workflows
1. Parallelized Ingestion with Spark
Using Apache Spark™’s distributed computing capabilities, Wizerr was able to ingest and parse thousands of datasheets concurrently. Databricks’ optimized runtime for Apache Spark significantly reduced processing time. When combined with partitioning and Z-ordering, an ingestion that previously took days could be done in a matter of hours, saving more than 90% of the cost and time for ingestion.
Spark integration with Pandas in Databricks helped Wizerr migrate their pipeline to Databricks, providing a seamless data manipulation experience and lowering the learning curve for teams transitioning to distributed data processing.
Along with cost and time reduction, Databricks also enhanced error handling and traceability during processing. The platform’s Delta Lake ACID compliance and structured logging made it simple for Wizerr to isolate and debug errors at specific stages and data entries, instead of having to rerun the entire pipeline.
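The idea of isolating failures to individual data entries, instead of rerunning the whole pipeline, can be sketched in plain Python. This is a stand-in that uses the standard library's thread pool rather than Spark, and it does not reflect Wizerr's actual code. `parse_datasheet`, `ParseResult` and `ingest` are hypothetical names, and the "key=value" input format is an assumption made purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass
class ParseResult:
    datasheet_id: str
    stage: str                       # pipeline stage that produced this record
    payload: Optional[Dict[str, str]] = None
    error: Optional[str] = None      # set only when the stage failed


def parse_datasheet(raw_text: str) -> Dict[str, str]:
    """Hypothetical parser: expects one 'key=value' pair per line."""
    if not raw_text.strip():
        raise ValueError("empty datasheet")
    fields: Dict[str, str] = {}
    for line in raw_text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"malformed line: {line!r}")
        fields[key.strip()] = value.strip()
    return fields


def _safe_parse(item: Tuple[str, str]) -> ParseResult:
    """Parse one datasheet and capture any failure as data, not a crash."""
    datasheet_id, raw_text = item
    try:
        return ParseResult(datasheet_id, "parse", payload=parse_datasheet(raw_text))
    except Exception as exc:
        return ParseResult(datasheet_id, "parse", error=f"{type(exc).__name__}: {exc}")


def ingest(
    batch: Iterable[Tuple[str, str]], max_workers: int = 4
) -> Tuple[List[ParseResult], List[ParseResult]]:
    """Parse a batch concurrently; return (succeeded, failed) in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(_safe_parse, batch))
    succeeded = [r for r in results if r.error is None]
    failed = [r for r in results if r.error is not None]
    return succeeded, failed
```

Because each failure is recorded with its datasheet ID and stage, only the failed entries need to be retried. On a distributed engine the same pattern would typically write the failed records to their own table for inspection.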
2. Enhanced Data Governance with Unity Catalog
For Wizerr’s enterprise customers, Unity Catalog played a pivotal role in managing data securely and transparently. Key benefits included:
- Centralized Metadata: Unified storage for data schema and lineage, making it easier to track data transformations.
- Role-Based Access: Securely granting access to sensitive data, ensuring compliance with industry standards.
- Cross-Team Collaboration: Allowed multiple teams to access relevant datasets without duplication or data silos.
3. Scalable AI Model Training
Databricks’ MLflow integration gave Wizerr the ability to seamlessly incorporate fine-tuned language models into their pipeline, streamlining training and deployment:
- Model tracking: MLflow made it easy to experiment with different LLMs (such as Llama 3.1 8B instruct and Mistral 7B instruct) and quantization methods and compare metrics such as latency, throughput, accuracy, and precision. Based on their initial results, Wizerr is considering hosting its own fine-tuned LLM using Databricks serving and hosting services in the future.
- Hyperparameter tuning: Databricks Mosaic AI Training facilitated efficient hyperparameter optimization by tracking parameter configurations and their impact on model performance for varied experimental setups.
- Versioning and deployment: MLflow’s model registry streamlined the transition from experimentation to production, simplifying version control and ensuring reliable model deployment.
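Once metrics are tracked for every run, choosing a model becomes a small selection problem. The sketch below shows one possible rule in plain Python: take the most accurate run that fits a latency budget. It does not use the MLflow API, and `RunSummary`, `pick_best` and the selection rule itself are assumptions for illustration. The post does not say how Wizerr weighs its metrics.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class RunSummary:
    run_name: str
    quantization: str   # e.g. "int8"; a label only, not interpreted here
    accuracy: float     # fraction of specifications extracted correctly
    latency_ms: float   # latency per query in milliseconds


def pick_best(runs: Iterable[RunSummary], max_latency_ms: float) -> Optional[RunSummary]:
    """Return the most accurate run within the latency budget, or None.

    Ties on accuracy are broken in favor of the lower latency.
    """
    eligible = [run for run in runs if run.latency_ms <= max_latency_ms]
    if not eligible:
        return None
    return max(eligible, key=lambda run: (run.accuracy, -run.latency_ms))
```

In practice the inputs would come from the tracked experiment runs instead of hand-built objects. Keeping the rule in one small function makes it easy to change the budget or add a further constraint such as throughput.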
4. Collaborative Model Workbench
Databricks’ collaborative environment became Wizerr’s central hub for evaluating model performance. Side-by-side comparisons let the team compare model outputs when extracting specifications like “Voltage – Output (Min)” or “Current – Output.” Visualization tools simplified debugging by showing model predictions and errors in detail. The Databricks Platform also facilitated iterative improvements by allowing engineers, data scientists, and domain experts to collaborate in real time.
5. Dynamic Autoscaling for Cost-Effective Compute
Databricks’ autoscaling clusters dynamically adjusted to match Wizerr’s workload intensity. During peak ingestion periods, clusters automatically scaled up to handle high throughput and automatically scaled down during idle periods, optimizing resource usage and reducing costs.
6. Medallion Architecture and Delta Tables
Thanks to the integration of Delta tables, Unity Catalog and Spark, Wizerr can seamlessly access databases both inside and outside the Databricks environment. This has helped Wizerr query tables with less code and make use of Spark’s distributed nature. In addition, CRUD operations between Delta tables and SQL tables take much less time.
Storing processed data at each pipeline stage simplified error checks, while Delta table versioning enabled Wizerr to track changes, compare versions, and quickly roll back if needed, enhancing workflow reliability.
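The versioning workflow described above (track changes, compare versions, roll back) can be modeled with a toy in-memory table. This is a conceptual sketch only: it is not the Delta Lake API, and `VersionedTable` and its methods are hypothetical names. In this toy model, restoring an old version appends it as a new version so that history is never lost.

```python
from typing import Dict, List, Optional, Set

Row = Dict[str, float]


class VersionedTable:
    """Toy table that keeps a full snapshot per version (version 0 is empty)."""

    def __init__(self) -> None:
        self._versions: List[Dict[str, Row]] = [{}]

    @property
    def current_version(self) -> int:
        return len(self._versions) - 1

    def _snapshot(self, version: int) -> Dict[str, Row]:
        if not 0 <= version <= self.current_version:
            raise IndexError(f"unknown version {version}")
        return self._versions[version]

    def upsert(self, rows: Dict[str, Row]) -> int:
        """Insert or overwrite rows by key; returns the new version number."""
        snapshot = dict(self._versions[-1])
        snapshot.update(rows)
        self._versions.append(snapshot)
        return self.current_version

    def read(self, version: Optional[int] = None) -> Dict[str, Row]:
        """Read the latest version, or an older one by number."""
        index = self.current_version if version is None else version
        return dict(self._snapshot(index))

    def changed_keys(self, old: int, new: int) -> Set[str]:
        """Keys added, removed, or modified between two versions."""
        before, after = self._snapshot(old), self._snapshot(new)
        return {k for k in before.keys() | after.keys() if before.get(k) != after.get(k)}

    def restore(self, version: int) -> int:
        """Roll back by appending the old snapshot as the newest version."""
        self._versions.append(dict(self._snapshot(version)))
        return self.current_version
```

A typical use is to upsert a corrected extraction, call `changed_keys` to see which parts were affected, and call `restore` if the new values turn out to be wrong.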
Results: Transforming Datasheet Processing
By integrating Databricks into their workflow, Wizerr achieved several benefits:
- Faster processing speed: Reduced datasheet ingestion and parsing time by 90%, handling 1,000,000+ datasheets in record time.
- Improved data integrity: Enhanced, open data governance with Unity Catalog ensured consistent and reliable outputs.
- Faster model iterations: MLflow and Databricks Workbench made it easier and faster to experiment with and fine-tune open source AI models.
- Effortless scalability: Databricks’ architecture enables Wizerr to scale effortlessly as data volumes continue to grow.
- Seamless collaboration: Unified tools brought together multiple teams, speeding up decision-making and innovation.
Why This Matters to Data Architects and Solution Engineers
Wizerr’s journey isn’t just about transforming electronics component engineering—it’s a blueprint for how any industry can operationalize complex AI workflows. By unifying data, leveraging domain-specific AI models, and operationalizing solutions at scale, Wizerr demonstrated what’s possible when the right tools meet the right vision. Databricks provides the flexibility and power to unify disparate data into actionable insights, build and deploy AI models quickly and at scale, and empower teams to deliver innovative, practical solutions faster than ever before.
Every industry has its challenges. Wizerr’s success shows that with the right platform, those challenges can become opportunities to revolutionize how we work.
This blog post was jointly authored by Arjun Rajput (Account Executive, Databricks) and Avinash Harsh (CEO, Wizerr AI).