My product team, comprising Business Analysts, an Area Architect, Domain Architects, a Product Owner, Developers, and Testers, is part of a product domain called Fuel and Environment. We empower fleet owners by delivering insightful reports that build a deeper understanding of fuel and energy efficiency across their vehicle fleet. Our comprehensive analytics cover all Volvo Group-owned vehicles, and we offer customizable dashboards that allow customers to segment data by individual drivers and vehicles. Fleet owners can also analyze specific vehicle metrics through both tabular and graphical representations. Furthermore, we provide an automated reporting feature, enabling customers to receive consolidated performance summaries on a weekly or monthly basis for streamlined decision-making.
A few main use cases of reports
Within the Fuel and Environment service we manage a substantial volume of vehicles. Approximately 300,000 of the 1.7 million connected vehicles report data to us, which means we receive around 21,000 messages per second. This data must be stored efficiently to meet the diverse needs of end users. Our service caters to all countries, which necessitates 24-hour availability to accommodate global requirements.
Our service also provides automated mechanisms that allow fleet owners to receive their reports at predefined, scheduled intervals. Fleets come in diverse sizes (ranging from small fleets of fewer than 20 connected vehicles to large fleets of more than 1,000), and most reports are scheduled for the end of the month and/or week. This challenges our service to handle high request volumes efficiently and asynchronously, at approximately 1,000 requests per second.
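To make the asynchronous handling more concrete, below is a minimal sketch of how scheduled report requests could be accepted quickly and rendered by a separate worker pool. The class names, queue size, and thread count are hypothetical, chosen for illustration rather than taken from our production code.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ReportRequestDispatcher {

    /** Hypothetical value object describing one scheduled report request. */
    public record ReportRequest(String fleetId, String reportType) {}

    // A bounded queue absorbs month-end bursts; the capacity is illustrative only.
    private final BlockingQueue<ReportRequest> queue = new ArrayBlockingQueue<>(50_000);

    // Workers drain the queue independently of request acceptance.
    private final ExecutorService workers = Executors.newFixedThreadPool(16);

    public ReportRequestDispatcher() {
        for (int i = 0; i < 16; i++) {
            workers.submit(this::drainQueue);
        }
    }

    /** Accept a request without doing the expensive rendering inline. */
    public boolean accept(ReportRequest request) {
        return queue.offer(request); // false if the queue is full, so callers can retry later
    }

    private void drainQueue() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                render(queue.take());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private void render(ReportRequest request) {
        // Placeholder for the actual report generation and delivery.
    }
}
```

The key design point is that accepting a request is decoupled from rendering it, so short spikes around month-end and week-end do not tie up the request-handling threads.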
Our system offers the flexibility of configuring scheduled reports, which places a technical responsibility on us to ensure that these reports are generated before business hours in specific countries. As part of our service, we distribute approximately 22k reports every month and 12k reports each week.
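The scheduling constraint can be illustrated with a small time-zone calculation. The sketch below assumes a 06:00 local cutoff as "before business hours"; both the cutoff and the class name are assumptions for the example, not our actual configuration.

```java
import java.time.LocalTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public class ReportScheduleCalculator {

    // Assumed cutoff: reports should be ready before local business hours start.
    private static final LocalTime GENERATION_DEADLINE = LocalTime.of(6, 0);

    /**
     * Returns the next point in time, expressed in UTC, by which a report for a
     * fleet in the given country time zone must be generated.
     */
    public ZonedDateTime nextDeadlineUtc(ZoneId fleetZone, ZonedDateTime nowUtc) {
        ZonedDateTime localNow = nowUtc.withZoneSameInstant(fleetZone);
        ZonedDateTime deadline = localNow.with(GENERATION_DEADLINE);
        if (!deadline.isAfter(localNow)) {
            deadline = deadline.plusDays(1); // today's cutoff has already passed
        }
        return deadline.withZoneSameInstant(ZoneId.of("UTC"));
    }
}
```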
Customers faced challenges with report delivery, particularly slow rendering times and delays in scheduled distributions. These issues hindered timely access to critical information, affecting decision-making processes. Addressing these performance bottlenecks was crucial to improving the user experience and ensuring prompt and efficient report delivery.
Our infrastructure was strained by heavy database resource utilization, with high read/write IOPS and maxed-out CPU usage. This posed a significant risk of database crashes, potentially leading to total service disruption.
Our service is developed using the latest version of Spring Boot, which handles most of the business logic, while the UI layer is built on a React.js portal framework. We host all our applications on AWS, using EC2 instances, and our database layer is Oracle 21. We chose Oracle for its ability to handle high data traffic and large volumes of data.
Before fixing the issue, we did a deep analysis of the root cause, as well as an end-to-end evaluation of all the bottlenecks that were contributing to the performance degradation.
Information Model
The information model was not aligned with the access patterns, so we analyzed the different ways the data is accessed. Below are a few points used as input for this analysis:
· Business logic
· How is the code aligned to support report rendering?
· User Interface
Our primary challenge was that the information model didn’t align with our access patterns, leading to inefficiencies and bottlenecks. Fixing this issue was extremely challenging because it required extensive modifications to the data model. Given the increasing volume of incoming traffic, we knew we needed to make these changes carefully and strategically. We decided to tackle the problem in a stepwise manner, allowing us to implement adjustments gradually while minimizing disruption. This approach not only made the process more manageable but also ensured that we could continuously monitor performance improvements along the way.
A step-by-step approach: simple but expensive
Digging deeper into the information model:
It’s crucial to regularly review the access patterns observed in production. These reviews help us determine whether the data is stored appropriately for its intended use.
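To illustrate what aligning the model to the access pattern can mean in practice, the sketch below contrasts two read paths: aggregating raw message rows at render time versus reading a summary table whose grain matches what the report displays. The table and column names are hypothetical and are not our actual schema; the example only shows the shape of the change.

```java
import java.time.LocalDate;
import java.util.List;
import org.springframework.jdbc.core.JdbcTemplate;

public class FuelReportRepository {

    /** Hypothetical per-vehicle, per-day summary row aligned to the report view. */
    public record DailyFuelSummary(String vehicleId, LocalDate day,
                                   double fuelConsumedLitres, double distanceKm) {}

    private final JdbcTemplate jdbc;

    public FuelReportRepository(JdbcTemplate jdbc) {
        this.jdbc = jdbc;
    }

    // Before: every report request scanned and aggregated raw message rows.
    public List<DailyFuelSummary> summariesFromRawMessages(String fleetId,
                                                           LocalDate from, LocalDate to) {
        return jdbc.query("""
                SELECT vehicle_id, TRUNC(message_time) AS day,
                       SUM(fuel_used) AS fuel, SUM(distance) AS dist
                FROM vehicle_messages
                WHERE fleet_id = ? AND message_time BETWEEN ? AND ?
                GROUP BY vehicle_id, TRUNC(message_time)
                """,
                (rs, n) -> new DailyFuelSummary(rs.getString("vehicle_id"),
                        rs.getDate("day").toLocalDate(),
                        rs.getDouble("fuel"), rs.getDouble("dist")),
                fleetId, from, to);
    }

    // After: the same report reads a table whose grain matches the access pattern.
    public List<DailyFuelSummary> summariesFromDailyTable(String fleetId,
                                                          LocalDate from, LocalDate to) {
        return jdbc.query("""
                SELECT vehicle_id, summary_day, fuel_litres, distance_km
                FROM daily_fuel_summary
                WHERE fleet_id = ? AND summary_day BETWEEN ? AND ?
                """,
                (rs, n) -> new DailyFuelSummary(rs.getString("vehicle_id"),
                        rs.getDate("summary_day").toLocalDate(),
                        rs.getDouble("fuel_litres"), rs.getDouble("distance_km")),
                fleetId, from, to);
    }
}
```

Reading a table that is already at the report’s grain reduces both the I/O and the CPU spent on aggregation for every render.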
The Client layer
Once we had identified the problem, finding an optimal solution was straightforward. The real challenge was determining how to implement it in production. We migrated the data from the old structure to the new one and launched the feature gradually, rolling it out on a customer-by-customer basis. This approach not only allowed us to assess performance improvements but also ensured a thorough quality check of the rendered data.
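A customer-by-customer rollout can be as simple as a per-customer toggle in front of the old and new read paths. The sketch below is a simplified illustration with hypothetical names; our actual rollout mechanism is not shown in this article.

```java
import java.util.Set;

public class ReportReadPathSelector {

    // Customers that have been migrated to and verified on the new data model.
    private final Set<String> customersOnNewModel;

    public ReportReadPathSelector(Set<String> customersOnNewModel) {
        this.customersOnNewModel = Set.copyOf(customersOnNewModel);
    }

    /** Decide per customer which data model backs the rendered report. */
    public boolean useNewModel(String customerId) {
        return customersOnNewModel.contains(customerId);
    }
}
```

Keeping both read paths available during the migration is what makes it possible to compare rendered output and performance for each customer before widening the rollout.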
What did we gain after all the changes?
· Customers could fetch 5-year reports.
· Better-utilized database resources, which helped us scale down and save costs.
Bio: Vinitha Rajagopal
I am a Senior Solution Architect working at Volvo Group Connected Solutions, currently focused on the Charging product area. I’ve been with this incredible company for 11 years, and I’m constantly amazed by how our products (trucks and buses) are more intelligent than the smartest phones. This has only deepened my passion for IoT. As an architect, I believe that business and technology should go hand in hand, and at Volvo Group, I get to do exactly that. Outside of work, I enjoy trekking, and being in the Himalayas is always a rejuvenating experience.