Prism Overview

Prism is a software component that lets you quickly integrate external databases into Palantir. Specifically, it lets you build high-performance Data Engine-based providers without writing any code. Instead, you define simple configuration files, and Palantir automatically constructs the data provider and database code for you. This ensures that all data access goes through well-tested, high-performance code paths. You can also iterate more quickly, because you can modify and reload Prism-based data providers without restarting the server.

Is Prism Right for My Data?

Prism supports the following data sources:

  • MS SQL Server
  • PostgreSQL
  • Oracle (Prism does not support Oracle PL/SQL. For more information, see Oracle PL/SQL is not fully supported)
  • Netezza
  • CSV (Comma-separated values) files*

Prism works by mapping the columns in your database into models and metrics in the system. It works well when your data set has an obvious column or columns to use as the model identifier (usually the primary key). We'll see examples of good data sets later.

* CSV files are meant for proof-of-concept applications only and will cause performance problems at production scale. At that scale, use a database such as Postgres. For examples of how to use Postgres with Prism, see Prism Examples.

Prism vs Web API

Prism differs from the Web API Data Provider and the Importer application because it connects your database directly to Palantir. You don't store a separate copy of the data in the Web API database, so you don't need to fit your data into the Web API's database format. Instead, your Prism configuration file contains mappings from your external data source to the models and metrics you want to see in Palantir.

Creating Your First Prism Provider

Let's create a Prism data provider that exposes some stocks in a CSV file. When we're done, you'll be able to access the CSV data from applications like Explorer and Chart.

Prerequisites

You should know how to access the file system on the Palantir dispatch server (usually through secure shell).

Turn on Prism

Open the file config.properties and set pf.provider.prism.enabled to true. This file is located in the Prism folder on your dispatch server (typically /opt/palantir/<server>/providers/prism/). It looks like this:

# provider-specific properties for Prism
pf.provider.prism.enabled=true

There are many config.properties files for different components on the server. Make sure you edit the file in the Prism folder. For more information, see Property Files.
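
As a quick sanity check before loading any providers, the following minimal sketch reads a properties file and reports whether pf.provider.prism.enabled is set to true. It assumes the simple key=value form shown above. The function names and the sample path are illustrative and are not part of Palantir.

```python
from pathlib import Path


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def prism_enabled(text):
    """Return True only if pf.provider.prism.enabled is set to true."""
    value = parse_properties(text).get("pf.provider.prism.enabled", "")
    return value.lower() == "true"


if __name__ == "__main__":
    # Hypothetical path; substitute the Prism folder on your dispatch server.
    path = Path("providers/prism/config.properties")
    if path.exists():
        print("Prism enabled:", prism_enabled(path.read_text()))
    else:
        print("No config.properties found at", path)
```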

Now that Prism is on, the server creates a data provider for every Prism configuration file in the <server>/providers/prism folder. So, let's create our first Prism config file.

Create a config file

  1. Create a file called MyFirstProvider.carbon in the Prism folder (<server>/providers/prism).
    Files that end with .carbon are called Prism configuration files. The provider's name matches the filename, so it will be MyFirstProvider.
  2. Copy and paste the following settings into the file.
    import com.palantir.finance.commons.data.model.*
    
    Config {
        Csv {
            rootDirectory = './providers/prism'
            nullValues = ['null', 'empty', 'n/a']
        }
    }
    
    CommonStock{
        File {
            filename = 'MyProperties.csv'
            modelSource = true
            id = [tokens: ['symbol'], columns:['model']]
            properties << ['name','description']
        }	
    }
    

    Prism config files contain Carbon script, which is a language Palantir created specifically for Prism. Its syntax is based on Groovy, but don't worry if you aren't familiar with Groovy; we'll cover all you need to know in later examples. Let's see what this file is doing:

    • The Config block contains information about your data source. Usually, it contains database settings such as the database type, URL, and password, but because we are connecting to a CSV data source, it contains the location and format of the CSV files.
    • The CommonStock block maps the data in the CSV file to models and metrics in the system. In this case, it says that the data in MyProperties.csv should be put into models of type CommonStock. Also, it says to map the CSV columns model, name, and description to the properties symbol, name, and description respectively.
    • The import line lets you specify model types in the system without having to type their full name. For example, you get to type CommonStock instead of com.palantir.finance.commons.data.model.CommonStock.
  3. Put MyProperties.csv in your Prism directory. This directory must match the rootDirectory setting in your config file.
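
This guide does not show the contents of MyProperties.csv. Based on the id and properties settings in the config above, the file needs at least the columns model, name, and description. The following minimal sketch writes such a file. The row values and the helper name are made up for illustration (STOCK1 and STOCK2 are the symbols used later in this guide).

```python
import csv
import tempfile
from pathlib import Path

# Column names taken from the config above: 'model' is the id column,
# 'name' and 'description' are the mapped properties.
FIELDNAMES = ["model", "name", "description"]

# Hypothetical sample rows; replace with your own data.
SAMPLE_ROWS = [
    {"model": "STOCK1", "name": "Stock One", "description": "First sample stock"},
    {"model": "STOCK2", "name": "Stock Two", "description": "n/a"},
]


def write_properties_csv(directory, rows=SAMPLE_ROWS):
    """Write MyProperties.csv into the given directory and return its path."""
    path = Path(directory) / "MyProperties.csv"
    with path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)
    return path


if __name__ == "__main__":
    # Demo: write into a temporary directory. In practice, pass the
    # directory that matches rootDirectory in your config file.
    print("Wrote", write_properties_csv(tempfile.mkdtemp()))
```

The config lists 'n/a' among its nullValues, so the second description would presumably be read as a missing value rather than as literal text.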

Load the Prism Provider

At this point, we have defined our data provider but the Palantir server has not loaded it. There are two ways to load the new data provider:

  • restart the dispatch server, either from the Palantir Configuration Server (PCS), if installed, or from the command line by running one of the scripts in <server>/bin.
  • run the MBean operation com.palantir.finance.data -> PrismProvider -> addPrismProvider. To access the MBean, you must connect to the server via JMX (usually with JConsole). For more information, see Monitoring and Managing Servers with JMX.

After loading the provider, you should be able to see your data. Here is what it looks like in the Explorer application.

Notice that there is a model for every row in the CSV file and also a model for the data provider itself. Also, we can see the model's symbol, name, and description.

Add Timeseries Data

  1. Download and put MyTimeseries.csv in your Prism directory.
    This file contains closing price and volume data for STOCK1 and STOCK2.
  2. Modify the CommonStock block of your config file to look like this:
    CommonStock{
        File {
            filename = 'MyProperties.csv'
            modelSource = true
            id = [tokens: ['symbol'], columns:['model']]
            properties << ['name','description']
        }	
        File
        {
            filename = 'MyTimeseries.csv'
            id = [columns:['model']]
            Timestamp {
                dateColumn = Column(name: 'date', dateFormat: FLEXDATE)
                metrics << 'close' << 'volume'
            }
        }
    }
    

    Here's what is going on:

    • This code adds a second File block for our second CSV file. Unlike the previous File block, this block is not a modelSource. This means that we're not creating models from the data in this file. Instead, we're adding the data to models created in the other block. The rows in this CSV file are matched to the models from the first CSV file by their id, which in this case is the column model.
    • The Timestamp block says that our CSV file has properties or metrics that change over time. The dateColumn variable defines the name and format of the column that contains dates.
    • We define metrics close and volume in the Timestamp block because they have different values at different times.
  3. Run the MBean operation com.palantir.finance.data -> <your_prism_provider_name> -> reloadProvider. This tells Prism to reload the configuration file and read the new CSV data. Note that the reloadProvider MBean will only pick up changes in the configuration file; if you alter the data only in the CSV file, you will need to restart your server to see the effects.
    If you made changes to the data, you may need to clear Palantir's data cache to see the changes from the client. To clear the cache, run the MBean operation com.palantir.finance.service -> Caches -> invalidateAll().
  4. (Optional) View your data in the Chart application by entering STOCK1 AND STOCK2.
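
To make the relationship between the two File blocks concrete, here is a conceptual sketch in Python of what the configuration describes: rows from the modelSource file create models, and rows from the timeseries file are attached to those models through the shared model column. This only illustrates the mapping and is not how Prism is implemented. The sample values, the ISO date format, and the handling of unmatched rows are assumptions, since the CSV contents and the formats that FLEXDATE accepts are not shown here.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Hypothetical file contents; only the column names come from the config.
PROPERTIES_CSV = """model,name,description
STOCK1,Stock One,First sample stock
STOCK2,Stock Two,Second sample stock
"""

TIMESERIES_CSV = """model,date,close,volume
STOCK1,2014-01-02,10.5,1000
STOCK1,2014-01-03,10.7,1200
STOCK2,2014-01-02,20.0,500
"""


def build_models(properties_text, timeseries_text, metrics=("close", "volume")):
    """Create one model per properties row, then attach metric ticks by id."""
    models = {}
    for row in csv.DictReader(io.StringIO(properties_text)):
        # The modelSource file creates the models and their properties.
        models[row["model"]] = {
            "symbol": row["model"],
            "name": row["name"],
            "description": row["description"],
            "metrics": defaultdict(list),
        }
    for row in csv.DictReader(io.StringIO(timeseries_text)):
        model = models.get(row["model"])
        if model is None:
            # Assumption: rows with no matching model are skipped, because
            # this file is not a modelSource and creates no models.
            continue
        day = date.fromisoformat(row["date"])
        for metric in metrics:
            model["metrics"][metric].append((day, float(row[metric])))
    return models


if __name__ == "__main__":
    for symbol, model in build_models(PROPERTIES_CSV, TIMESERIES_CSV).items():
        print(symbol, dict(model["metrics"]))
```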

Previous Versions

Version 3.18

  • Prism now supports streaming results read from the database out to the data platform. This reduces the memory footprint for computations based on data in Prism providers because the system no longer needs to collect all results in memory before returning them.
  • Intraday Carbon files written for version 3.18.1 will break when you upgrade to version 4.0.2. The breakage is caused by ITD changes, so you will need to update these files when you upgrade.

Version 3.16

  • You can now add documentation (descriptions, examples, tags) to Prism properties and metrics. This documentation appears in the metric flyout window. For more information, see Carbon Reference#Documentation and Prism Examples#Parameters and Documentation.
  • You can now add Prism providers without restarting the dispatch server by running the MBean com.palantir.finance.data.PrismProvider.addPrismProvider.
  • You can now return models in ObjectTimeSeries. In previous versions, ObjectTimeSeries could not contain models. For an example, see Prism Examples#Object Time Series - Models.

Version 3.15

  • You can now create metrics that return objects. In previous versions, metrics had to return TimeSeries - a set of time/number ticks. Now, metrics can also return ObjectTimeSeries - a set of time/object ticks. For an example, see Prism Examples#Object Time Series - Strings.
  • Prism now supports providers that expose only CSV data. In previous versions, the database configuration block was required.
  • For Prism providers, the default value for the Table threshold setting has been increased to .5 (50%). In previous versions, the default value was .2 (20%). For more information, see Carbon Reference#Table Config.

Version 3.14

In version 3.14, Prism providers must connect to a SQL database. This means every provider must have a valid Database block in the config section. In versions 3.15 and later, you can omit the Database block if your provider exposes only CSV data. For more information, see Carbon Reference#Database.

Common Issues and Questions

Oracle PL/SQL is not fully supported

Specifically, Prism does not support Oracle PL/SQL that uses varchar2 fields bigger than 4000 bytes (PL/SQL supports up to 32KB).

PL/SQL will work if any one of the following is true: 1) the threshold is set to 0, 2) you don't use varchar2, or 3) the max_size in varchar2(max_size) is less than 2000.

Use lowercase column names

Because support for mixed case column names is vendor specific, you may encounter errors if your database contains mixed case column names. Prism supports lowercase column names for all supported databases.
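
One way to catch this early is to scan your column names before you write the Prism configuration. The sketch below flags any name that is not entirely lowercase. The function name and the sample column names are illustrative, and how you obtain the column list (a CSV header row, a database metadata query) depends on your data source.

```python
def non_lowercase_columns(column_names):
    """Return the column names that contain uppercase characters."""
    return [name for name in column_names if name != name.lower()]


if __name__ == "__main__":
    # Hypothetical column names, for example read from a CSV header row.
    columns = ["model", "Date", "close", "tradeVolume"]
    offenders = non_lowercase_columns(columns)
    if offenders:
        print("Rename these columns to lowercase:", ", ".join(offenders))
    else:
        print("All column names are lowercase.")
```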

See Also

Carbon Reference
Prism Examples

© 2014 Palantir Technologies