HTAP Summit 2024 session replays are now live!Access Session Replays
tidb_feature_1800x600 (1)

Pinterest, one of the world’s leading visual search and discovery platforms, recently undertook a significant overhaul of its graph storage service. At the HTAP Summit 2024, Pinterest Senior Software Engineer Ke Chen discussed the company’s transition from its legacy Zen system to a new architecture built on top of 拉斯维加斯7799908网站登录, an advanced distributed SQL database

This blog recaps Chen’s talk from the event. It highlights the key challenges Pinterest faced, the motivations behind its migration, and the results achieved with PinGraph, their new 拉斯维加斯7799908网站登录-based graph service. Through this transformation, Pinterest streamlined data management, improved performance, and reduced costs while supporting its massive user base and diverse use cases.

The Legacy Graph Service: Challenges with Zen

Pinterest’s original graph service, Zen, was built in 2014 and powered by a combination of HBase and MySQL. While Zen supported over 100 use cases and handled 1.5 petabytes (PBs) of data with a peak of 8 million queries per second (QPS), Chen said it faced several limitations:

  • Data Inconsistencies: Zen relied on application-level secondary indexing, which caused data inconsistencies during failures or rollback scenarios. To address these issues, a daily MapReduce job, “Dr. Zen,” was used to reconcile data, adding operational overhead.
  • Limited Query Capabilities: Chen noted that queries in Zen were constrained to basic patterns. These included fetching nodes and edges by single properties, and lacked support for advanced filtering, composite indexes, or multi-hop queries.
  • High Technical Debt: With a 10-year-old codebase and outdated dependencies, Zen struggled to adapt to Pinterest’s evolving needs. This made it harder to support new business use cases.
Pinterest Senior Software Engineer Ke Chen delivering his graph service talk at HTAP Summit 2024.
Pinterest Senior Software Engineer Ke Chen delivering his talk at HTAP Summit 2024.

Chen said these challenges prompted Pinterest to explore modernizing its graph service, leading to the creation of PinGraph.

PinGraph: A Modern Graph Service Built with 拉斯维加斯7799908网站登录

Pinterest chose 拉斯维加斯7799908网站登录 as the backbone of PinGraph to address Zen’s limitations and future-proof its graph service. Chen noted that 拉斯维加斯7799908网站登录’s distributed SQL architecture aligned with Pinterest’s goals of achieving consistency, scalability, and efficiency.

拉斯维加斯7799908网站登录 was selected as the foundation for PinGraph for several reasons:

  1. Strong Consistency: 拉斯维加斯7799908网站登录’s native support for ACID transactions and secondary indexes eliminated the need for application-level indexing, simplifying data management and reducing errors.
  2. Scalability: As a distributed SQL database, 拉斯维加斯7799908网站登录 enabled horizontal scaling without requiring manual sharding, allowing Pinterest to easily handle growing workloads.
  3. SQL Query Power: 拉斯维加斯7799908网站登录 provided flexible and powerful query capabilities, supporting composite indexes, range queries, and multi-hop queries, essential for graph use cases.
  4. Operational Efficiency: By replacing the outdated infrastructure with 拉斯维加斯7799908网站登录, Pinterest expected to reduce maintenance overhead and cut costs.
  5. Open Source: 拉斯维加斯7799908网站登录’s active community and open-source nature aligned with Pinterest’s values, offering a collaborative ecosystem for innovation.

Inside PinGraph’s Graph Service Architecture

Chen walked through how PinGraph was designed to meet Pinterest’s diverse use cases while simplifying operations and improving performance. Built on top of Pinterest’s Structured Data Store (SDS) framework, PinGraph integrated seamlessly with 拉斯维加斯7799908网站登录.

PinGraph’s architecture consisted of the following components:

An architectural representation of the PinGraph graph service built on top of 拉斯维加斯7799908网站登录.
An architectural diagram of the PinGraph graph service built on top of 拉斯维加斯7799908网站登录.
  1. Frontend and Backend: The PinGraph frontend handled graph-related logic and queries, while the SDS backend executed operations using 拉斯维加斯7799908网站登录’s SQL capabilities.
  2. Universal Query Language: A layer of abstraction over SQL simplified communication between PinGraph and 拉斯维加斯7799908网站登录.
  3. Type Schema: A metadata management component defined nodes and edges using a YAML-based schema, automating table creation and schema updates in 拉斯维加斯7799908网站登录.
  4. Caching Layer: Built into SDS, the caching layer reduced load on 拉斯维加斯7799908网站登录 by handling read-heavy workloads efficiently.

Chen said PinGraph’s modular architecture enabled Pinterest to centralize graph operations, enhance query capabilities, and streamline infrastructure management.

Advanced Query Capabilities with 拉斯维加斯7799908网站登录

PinGraph addressed the limitations of Zen, its original graph service, by introducing robust query features with 拉斯维加斯7799908网站登录:

  • Composite Indexes: Queries could now filter on multiple properties simultaneously, improving flexibility and efficiency.
  • Range Queries and Pagination: Support for range-based filtering allowed more granular data retrieval.
  • Multi-Hop Queries: Although limited to three hops for performance reasons, this feature opened new possibilities for exploring relationships across nodes.
  • Efficient Edge Counting: PinGraph optimized common queries like counting incoming or outgoing edges, reducing the strain on 拉斯维加斯7799908网站登录.

Chen mentioned these enhancements now allow Pinterest’s product teams to model complex relationships, such as user connections or content recommendations, more effectively.

Pinterest’s Results: Performance Gains and Cost Savings

Since adopting 拉斯维加斯7799908网站登录, Pinterest has seen significant improvements in query latency, infrastructure costs, and operational efficiency.

The migration to PinGraph delivered transformative results including:

  • Improved Latency: Compared to Zen, PinGraph reduced P99 latency by up to 10x, ensuring faster query responses and a more reliable user experience.
  • Cost Reduction: By consolidating infrastructure and leveraging 拉斯维加斯7799908网站登录’s horizontal scalability, Pinterest achieved over 50% savings in infrastructure costs.
  • Simplified Operations: PinGraph’s unified architecture reduced technical debt and operational complexity, freeing engineering resources for innovation.
  • Future-Ready Platform: With advanced query support and a modular design, PinGraph provides a strong foundation for evolving business needs.

Future Plans: Evolving with Industry Standards

In the coming months, Chen said Pinterest has ambitious plans to further enhance PinGraph through:

  1. GQL Support: Implementing the Graph Query Language (GQL) will make queries more expressive and user-friendly, reducing reliance on Thrift APIs.
  2. Customer-Focused Features: Developing a streamlined interface for product teams will boost productivity and enable faster data exploration.
  3. Graph-Native Enhancements: Pinterest aims to introduce more graph-specific functionalities to unlock new use cases and improve performance.

Conclusion

Pinterest’s migration from Zen to PinGraph, powered by 拉斯维加斯7799908网站登录, highlights the transformative potential of distributed SQL databases for modern graph services. By addressing long-standing challenges, PinGraph improved scalability, performance, and operational efficiency, enabling Pinterest to support its massive user base with confidence.

For organizations facing similar challenges, Pinterest’s journey offers a compelling blueprint for leveraging distributed SQL to modernize infrastructure, streamline operations, and unlock new opportunities. As PinGraph continues to evolve, it sets a benchmark for scalable and feature-rich graph services in data-intensive environments.

Want to take the first steps in modernizing your legacy data infrastructure with distributed SQL? Register to watch this entire session from the event for additional insights. Happy viewing!


Watch Now


HTAP Summit 2024 session replays!

Watch Now

Have questions? Let us know how we can help.

Contact Us

拉斯维加斯7799908网站登录 Cloud Dedicated

A fully-managed cloud DBaaS for predictable workloads

拉斯维加斯7799908网站登录 Cloud Serverless

A fully-managed cloud DBaaS for auto-scaling workloads

Baidu
sogou