PostgreSQL - Tables

Created:2019-06-17  Last modified:2019-06-17


  1. Introduction

    1. Create a table

                      CREATE TABLE table_name (
                      column_name DOMAIN / TYPE [COLLATE ...] column_constraint [DEFAULT ...],
                      CONSTRAINT constraint_name table_constraint
                      ) INHERITS existing_table_name;
                  

      Example

                      CREATE TABLE public.film
                      (
                          film_id integer NOT NULL DEFAULT nextval('film_film_id_seq'::regclass), -- using a sequence generator with converting type to OID.
                          title character varying(255) COLLATE pg_catalog."default" NOT NULL,
                          description text COLLATE pg_catalog."default",
                          release_year year,
                          language_id integer NOT NULL,
                          rental_duration smallint NOT NULL DEFAULT 3,
                          rental_rate numeric(4,2) NOT NULL DEFAULT 4.99,
                          length smallint,
                          replacement_cost numeric(5,2) NOT NULL DEFAULT 19.99,
                          rating mpaa_rating DEFAULT 'G'::mpaa_rating,
                          last_update timestamp without time zone NOT NULL DEFAULT now(),
                          special_features text[] COLLATE pg_catalog."default",
                          fulltext tsvector NOT NULL,
                          CONSTRAINT film_pkey PRIMARY KEY (film_id),
                          CONSTRAINT film_language_id_fkey FOREIGN KEY (language_id)
                              REFERENCES public.language (language_id) MATCH SIMPLE
                              ON UPDATE CASCADE
                              ON DELETE RESTRICT
                      )
                      WITH (
                          OIDS = FALSE
                      )
                      TABLESPACE pg_default;
                  
    2. Drop a table

                      DROP TABLE public.film;
                  
  2. Schema

    In PostgreSQL, a schema is a namespace that contains named database objects such as tables, views, indexes, data types, functions, and operators.

                databases{
                    schemas{ // logical groups
                        tables,
                        functions,
                        views,
                        sequences,
                        types,
                    }
                }
            

    A schema has a owner.

    1. public schema

      Each database has a default schema called "public". If creating a object (table, function, ...) without specifying a schema, then it belongs to current schema.

                              select current_schema();
                          
                              -- schema is specified before the object with a period.
                              create table [schema-name.]student{
                  
                              }
                          
    2. Schema search path & Current schema

      Same as OS environment PATH that PostgreSQL use to search for an object without telling the schema name. The default schema path is

                                  SHOW search_path; -- the default value is "$user", public, where "$user" represents a schema with the current username.
                                  SET search_path TO qinnan, public; -- update search path.
                              
      The current_schema is the first avaliable schema on the search path. For example, if "qinnan" schema does not exist, then the current schema is public.
                                  SELECT current_schema();
                              

    3. Create schema

                              CREATE SCHEMA schema_name; -- created a schema owned by the current user;
                              CREATE SCHEMA schema_name AUTHORIZATION user_name; 
                              DROP SCHEMA myschema;
                              DROP SCHEMA myschema CASCADE;
                              -- Schema names beginning with pg_ are reserved for system purposes and cannot be created by users.
                          
  3. Data Types / Domains

    1. OID (Object Identifiers)
      1). Object identifiers (OIDs) is data type that is used internally by PostgreSQL as primary keys' type for various system tables.
      2). It is currently implemented as an unsigned four-byte integer, so it is very large and should not used as user-defined table's primary key.
      3). It has a couple of aliases, e.g. regclass.
                              CREATE TEABLE ....(
      
                              )
                              WITH (
                                  OIDS = TRUE
                              )
                              -- or 
                              CREATE TEABLE ....(
      
                              )
                              WITH OIDS;
                              -- create a system column with name oid and type OID to the table.
                              -- the oid column is hidden column, has to be explicitly query.
                              
                              select oid, * from table_name;
                          

      The system column, oid, is automatically assigned value. It is like a shared "sequence" with OID type.

    2. Integer
          smallint	2 bytes	small-range integer	-32768 to +32767
          integer	4 bytes	typical choice for integer	-2147483648 to +2147483647
          bigint	8 bytes	large-range integer	-9223372036854775808 to +9223372036854775807
                      

      PostgreSQL does not support tinyint (1 byte).

    3. String/Chars
          character varying(n), varchar(n)	variable-length with limit
          character(n), char(n)	fixed-length, blank padded
          text	variable unlimited length                            
                          

      ** text is not in the SQL standard, several other SQL database management systems have it as well. **

                                  -- If one explicitly casts a value to character varying(n) or character(n), then an over-length value will be truncated to n characters without raising an error. (This too is required by the SQL standard.)
                                  select 'qinnan'::char(4); -- 'qinn'
                          

      Charset: postgresql support multiple encoding methods.

                                  SHOW SERVER_ENCODING; -- the default is UTF8
      
                                  -- utf8, utf16 defined how to stored number (binary)
                                  -- unicode defined the mapping between character and number (binary)
                                  -- 秦 -> (unicode) 79E6
                                  insert into table_name (lastname) values (U&'\79E6')
                          

      MS SQL Server defined nvarchar and varchar. varchar only supports ASCII, but saving space. nvarchar supports unicode.

      Collation

      char, varchar and text are collatable data types, meaning they have different sorting/comparison rules.

    4. Enum

      Postgres supports Enum. Example, CarType_Lookup_table

      Without supporting Enum

      CarTypeId in database is defined as a tinyint or smallint (depending on size). When code/program is accessing this field, it is also a number (byte/short ...), and then convert it to a Code defined Enum by developer.

      With Enum

      As the developer's prespective, enum is like string in database, but the underlying storage is int. In code, developers also need to define a code-side Enum. And supply a conversion from string to code-defined enum in (EF) framework But when using it in developer's code, it directly becomes a enum. (don't need to convert it in developer's code)

                                  -- example of SQL defined enum type.
                                  CREATE TYPE public.mpaa_rating AS ENUM
                                  ('G', 'PG', 'PG-13', 'R', 'NC-17'); 
      
                                  create table public.MovieTypeLink(
                                      link_id BIGSERIAL,
                                      movie_id int not null,
                                      movie_type mpaa_rating not null
                                  )
      
                                  insert into public.MovieTypeLink (movie_id, movie_type) values (1, 'PG')
                                  -- 22P02: invalid input value for enum mpaa_rating: "PsG"
                                  -- insert into public.MovieTypeLink (movie_id, movie_type) values (1, 'GP');
                          
    5. Serial/Identity

      Serial is a group of speical data types, it defines smallserial, serial, and bigserial. If a column has this type, it will have a default value as auto incremented.

                                  CREATE TABLE tablename (
                                      colname SERIAL 
                                  );
                                  -- ====> it actually creates a sequence with binding to this column. (delete column, also delete sequence)
                                  -- And it is marked as a default value, meaning it can be explicitly inserted with different value.
                                  -- It is not null but not marked as unique.
                                  CREATE SEQUENCE tablename_colname_seq AS integer;
                                  CREATE TABLE tablename (
                                      colname integer NOT NULL DEFAULT nextval('tablename_colname_seq')
                                  );
                                  ALTER SEQUENCE tablename_colname_seq OWNED BY tablename.colname;
                          

      Identity is a SQL standard, (MS SQL server also supports it), it a more configurable 'serial'. It can be set default or always, the sequence is also configurable. PostgreSQL version >= 10

                              -- column_name type GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY[ ( sequence_option ) ]
                              -- type can be SMALLINT, INT, or BIGINT
                              CREATE TABLE tablename (
                                  colname INT GENERATED ALWAYS AS IDENTITY (START WITH 10 INCREMENT BY 10)
                              );
      
                          
    6. Datetime

      timestamp: date + time, there are two version of timestamp, with/without timezone.
      timestamp [timestamp without time zone]
      timestamptz [timestamp with time zone]

      the timestamptz is actually same as timestamp in terms of storage (both are 8 bytes), the different is timestamptz need to the db server's timezone, and convert time to UTC timezone and then store it. When retrieve its value, convert it back to current timezone.

                              -- select current timezone.
                              show timezone;
                              -- current time in db server's timezone
                              select now(); 
                              select CURRENT_TIMESTAMP;
      
                              -- insert value is as a string with a certain format.
                              -- '2016-06-22 19:10:25+08 
      
                              -- convert a UTC+8 to us eastern.
                              select timezone('US/Eastern', '2016-06-01 19:10:25+8');
                          
    7. Define a domain
    8. Type casting
                              select '2'::oid; -- :: is postgres shortcut, it is not a sql standard.
                              select cast('12' as oid); -- standard SQL casting.
                      
  4. Constraints

    Common constraints

    1. NOT NULL:
  5. Sequences

    A sequence in PostgreSQL is a user-defined "schema-bound" sequence (integer) number generator that generates a sequence of integers based on a specified specification.

    It is usually used along with a primary key.

    1. Definition Syntax

                          CREATE [ TEMPORARY | TEMP ] SEQUENCE [ IF NOT EXISTS ] name
                          [ AS data_type ] -- Valid types are smallint, integer, and bigint. bigint is the default
                          [ INCREMENT [ BY ] increment ]
                          [ MINVALUE minvalue | NO MINVALUE ] [ MAXVALUE maxvalue | NO MAXVALUE ]
                          [ START [ WITH ] start ] 
                          -- The optional clause CACHE cache specifies how many sequence numbers are to be preallocated and stored in memory for faster access. 
                          -- The minimum value is 1 (only one value can be generated at a time, i.e., no cache), and this is also the default.
                          [ CACHE cache ] 
                          [ [ NO ] CYCLE ]
                          [ OWNED BY { table_name.column_name | NONE } ];
      
                          CREATE SEQUENCE public.film_film_id_seq
                          INCREMENT 1
                          START 1000
                          MINVALUE 1
                          MAXVALUE 9223372036854775807
                          CACHE 1;
                          
                          ALTER SEQUENCE public.film_film_id_seq
                          OWNER TO postgres;
                          
    2. Select with functions

                              select nextval('public.film_film_id_seq'); -- 1001
                              select currval('public.film_film_id_seq'); -- 1001
                              select setval('public.film_film_id_seq', 3); 
                              select nextval('public.film_film_id_seq'); -- 4
                          
    3. Show all sequences

                                  SELECT c.relname FROM pg_class c WHERE c.relkind = 'S';
                                  -- pg_class is a system catalog ????, which stores metadata.