This PGDay Hyd 2024 is hosted by Hyderabad PostgreSQL User Group. This inaugural PGDay is dedicated to expanding the PostgreSQL community across India. PGDay aims to enhance PostgreSQL awareness and facilitate its adoption through targeted Talks, Round-table discussions, and Networking sessions.
We are committed to fostering a robust network that includes Developers, Database Administrators, End-users, and Businesses of all sizes — from small and medium enterprises to large multinational corporations and government bodies—that rely on PostgreSQL for their operational needs.
This keynote will explore the pivotal role PostgreSQL plays in advancing AI/ML within open-source ecosystems. The session will emphasize how PostgreSQL’s open-source nature democratizes access to cutting-edge tools, driving innovation and efficiency in AI adoption. Attendees will discover how PostgreSQL can support the creation of scalable, cost-effective data pipelines, enabling organizations to transition their AI/ML and GenAI use cases from experimentation to production with confidence and predictability.
PostgreSQL offers hundreds of tunable parameters, and setting them correctly can significantly improve application performance and save substantial resources. However, tuning these parameters requires considerable effort and expertise.
This presentation will introduce the fundamentals of database parameter tuning, highlighting its importance and exploring existing solutions. We will also provide a brief introduction to DBtune, our automated parameter tuning service, and how it supports organizations in automating this often-overlooked yet critical business task.
DBtune is a fully automated parameter tuning service that is both workload and machine resource aware. Leveraging machine learning technology, it customizes each optimization to the specific customer workload and hardware in use, ensuring optimal performance tailored to your unique environment.
The tech talk delves into similarity search and it’s benefit by implementing it through pgvector, which is an open-source extension that enables efficient vector similarity searches within PostgreSQL databases. It explains the underlying principles of vector embeddings and how pgvector leverages indexing techniques to accelerate similarity queries.
Additionally, the talk presents practical examples and code snippets demonstrating how to set up and utilize pgvector for vector similarity search use cases. It covers topics like recommendation system, indexing, and querying for similar vectors based on a given input vector.
Today’s modern applications must be Always-on, Always-available and serve a global user base that demands fast response times. A fully Multi-master Distributed Database architecture is critical for organizations to meet these demands. In this talk, we will explore the core use cases that benefit from deploying PostgreSQL with asynchronous logical replication in a Multi-region or even Multi-cloud environment. Learn how issues such as slow response times, maintenance or provider downtimes, data storage requirements, and throughput can easily be addressed.
You’ll see how pgEdge is demonstrating the value of Open source, pure Postgres, multi-master solution to improve data latency and high availability as well as address data residency requirements.
Key takeaways:
By attending this session, you will gain valuable insights into the world of scaling PostgreSQL for high-traffic applications. Whether you are a developer, architect, or database administrator, this talk will equip you with the knowledge and best practices necessary to design and implement a scalable and performant Multi-master Distributed Database architecture.
In this talk, I’ll delve into the different use cases of declarative partitioning in PostgreSQL.
While it offers significant benefits, there are limitations even in the latest PostgreSQL versions.
I’ll focus on how the planner handles various query patterns on partitioned tables.
For instance, plan-time pruning occurs only when the partition key is hardcoded. If the partition key comes from a subquery or a common table expression (CTE), the planner includes all partitions.
Additionally, query planning time increases as the number of child partitions grows.
In the evolving landscape of technology, the synergistic integration of databases and artificial intelligence/machine learning (AI/ML) is pivotal in driving innovation and efficiency across various sectors. PostgreSQL (Postgres), a robust and versatile open-source relational database management system, stands at the forefront of this transformation. Its advanced features, extensibility, and performance make it an ideal foundation for AI/ML applications. This paper explores the dynamic interplay between Postgres and AI/ML, highlighting AI/ML frameworks like TensorFlow, PyTorch, and scikit-learn enhances intelligent applications. Postgres’s robust features streamline workflows, reduce latency, and support complex data types, enabling predictive analytics and real-time processing. This synergy boosts competitive advantages and improves decision-making.
The talk dives into diagnostic capabilities of PostgreSQL’s EXPLAIN ANALYZE command with emphasis on BUFFERS option. The topic although is interesting but can feel daunting to lots of people. This talk with help understand a bit more deeply in to that. We will cover how this tool can provide crucial insights into query execution plans and buffer usage, which are essential for optimizing database performance. We will explore the significance of buffer statistics, such as blocks hit and read, and discuss specific scenarios where these metrics can be misleading, such as in nested loops where shared hit counts may be double-counted. We will also have practical examples with the queries running and demonstrate what block hits vs block reads indicates. We will end with a look at how we did apply the diagnosis for running one of our GetPostsBatchForIndexing to make it faster.
Session will cover all curated new features listed in PostgreSQL 17 Beta release that will benefit Database Developers or Migration Engineer.
It will highlights key features around Development Will share working snippet of all features and share use cases it will benefit. |
PostgreSQL is first and foremost an OLTP DBMS. But it has a rich set of features which allows it to be used as an OLAP DBMS. It is highly compliant with SQL standard, providing analytical features like windowing. It supports many data types, including Geo-spatial datatypes and JSON, which are important for analytical workloads. It supports full-text search on document data types and support non-traditional indexes like GIN, BRIN along side partial and expression indexes. Using physical or logical replication one can create a low maintenance ELT system, to separate OLAP and OLTP workloads. It is extensible through Extensions, which extend its already rich ecosystem by adding complex data types, user defined functions etc. Parallel querying capability allows processing larger datasets in lesser time. Foreign data wrappers allow querying external data sources in todays hybrid data models. Partitioning along with FDW paves path for horizontal scaling.
Yet when it comes to processing TBs for data it falls short of the other DBMSes built specially for analytical workloads. However PostgreSQL ecosystem is stepping up to this challenge. Once of the current trends is to use a combination of cloud columnar storage format, columnar execution engine and FDW. Columnar execution engine and columnar storage make it possible to process large amounts of data efficiently. FDW provide a way to interface columnar engine from PostgreSQL. PostgreSQL itself provides rich SQL and handles authentication, governance and metadata. These systems are easier to maintain, do not involve data movement and are yet reasonably efficient.
The talk will explore PostgreSQL’s offerings in the analytical space.
As the demand for scalable and responsive database solutions grows, managing PostgreSQL connections effectively becomes paramount. In this talk, we delve into the intricacies of scaling PostgreSQL connections, addressing the pivotal issues of performance optimization and cost efficiency.
Our topic encompasses a comprehensive exploration of the following key topics:
Understanding the Cost of Connections:
We dissect the overhead associated with PostgreSQL connections, shedding light on the implications and performance bottlenecks incurred by inefficient connection management.
The Imperative Need for Connection Pooling:
Recognizing the limitations of direct client-to-server connections, we underscore the necessity of implementing a robust connection pooling mechanism to alleviate strain on database resources and enhance scalability.
Exploring various connection Pooling options:
we explore options for connection pooling and load balancing. By comparing various solutions, we equip participants with the knowledge to make informed decisions tailored to their specific requirements.
The talk firstly introduces all pertinent levels of database monitoring and then focuses on PostgreSQL and the means it provides. The meaning and importance of key metrics will be explained. As the Postgres community has already developed a lot of tools in that area, some popular common options will be highlighted together with the problems that different monitoring approaches have. To overcome some of these problems an Open Source tool from Cybertec, called pgwatch, is introduced and explained in detail to offer the simplest possibly entry into exhaustive Postgres monitoring. Also discussed will be advanced topics like anomaly detection and alerting, which can be easily implemented on top of the underlying data tier (TimescaleDB) with the help of the TICK stack or Grafana.
Managing Data Definition Language (DDL) changes in distributed PostgreSQL environments has traditionally been a manual and error-prone process, hindering efficiency and scalability. In this talk, we introduce the groundbreaking feature of automatic DDL replication with pgEdge. Join us to explore how this innovation simplifies the propagation of schema changes across multiple nodes, eliminating the need for manual interventions and reducing the risk of inconsistencies. We will delve into the technical details of what automatic DDL replication entails, how it integrates with existing PostgreSQL infrastructure, and the steps required for effective implementation. Attendees will learn about the configuration process, best practices to ensure seamless replication, and potential pitfalls to avoid. Key-takeaways: |
What is PostgreSQL?
PostgreSQL is a powerful, open source object-relational database system that uses and extends the SQL language combined with many features that safely store and scale the most complicated data workloads. The origins of PostgreSQL date back to 1986 as part of the POSTGRES project at the University of California at Berkeley and has more than 35 years of active development on the core platform.
PostgreSQL has earned a strong reputation for its proven architecture, reliability, data integrity, robust feature set, extensibility, and the dedication of the open source community behind the software to consistently deliver performant and innovative solutions. PostgreSQL runs on all major operating systems, has been ACID-compliant since 2001, and has powerful add-ons such as the popular PostGIS geospatial database extender. It is no surprise that PostgreSQL has become the open source relational database of choice for many people and organisations.
Getting started with using PostgreSQL has never been easier – pick a project you want to build, and let PostgreSQL safely and robustly store your data.
PostgreSQL is a powerful, open source object-relational database system with over 35 years of active development that has earned it a strong reputation for reliability, feature robustness, and performance.
There is a wealth of information to be found describing how to install and use PostgreSQL through the official documentation. The open source community provides many helpful places to become familiar with PostgreSQL, discover how it works, and find career opportunities. Learn more on how to engage with the community.