Portable Format for Analytics
{{Infobox file format
| name = Portable Format for Analytics
| icon = PFA Logo-200x200.png
| iconcaption =
| icon_size =
| screenshot =
| screenshot_size =
| caption =
|_noextcode =
| extension =
|_nomimecode =
| mime =
| type_code =
| uniform_type =
| conforms_to =
| magic =
| developer = Jim Pivarski<br />Data Mining Group
| released =
| latest_release_version = 0.8.1
| latest_release_date = {{start date and age|2015|11|10}}
| type = Predictive modelling
| container_for =
| contained_by =
| extended_from = JSON
| extended_to =
| standard =
| free =
| open =
| url = {{URL|https://dmg.org/pfa/}}
}}
The Portable Format for Analytics (PFA) is a JSON-based predictive model interchange format conceived and developed by Jim Pivarski.{{citation needed|date=December 2017}} PFA provides a way for analytic applications to describe and exchange predictive models produced by analytics and machine learning algorithms. It supports common models such as logistic regression and decision trees. Version 0.8 was published in 2015. Subsequent versions have been developed by the Data Mining Group.{{cite web |url=http://dmg.org/ |title=Data Mining Group |accessdate=December 14, 2017 |quote=The DMG is proud to host the working groups that develop the Predictive Model Markup Language (PMML) and the Portable Format for Analytics (PFA), two complementary standards that simplify the deployment of analytic models.}}
PFA is complementary to the Predictive Model Markup Language (PMML), the Data Mining Group's XML-based standard for predictive model interchange.{{cite web|url=http://www.kdnuggets.com/2016/01/portable-format-analytics-models-production.html|title=Portable Format for Analytics: moving models to production|accessdate=April 25, 2016}}
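As a rough illustration of what "JSON-based" means here, the following Python sketch builds a very small PFA-style document as a dictionary and serializes it to JSON. It assumes the three top-level fields used in the examples later in this article (<code>input</code>, <code>output</code>, <code>action</code>). The function name <code>make_add_constant_engine</code> and the add-a-constant action are hypothetical choices for illustration, and the sketch neither validates nor executes the document.

```python
import json


def make_add_constant_engine(constant):
    """Return a PFA-style document (as a dict) that adds `constant` to its input.

    Assumption: the document needs only "input", "output" and "action",
    the same top-level fields used by the examples in this article.
    """
    return {
        "input": "double",                       # type of each incoming value
        "output": "double",                      # type of each result
        "action": [{"+": ["input", constant]}],  # expression evaluated per input
    }


doc = make_add_constant_engine(100)
pfa_json = json.dumps(doc, indent=2)   # text form that applications could exchange
round_tripped = json.loads(pfa_json)   # any JSON parser can read it back
print(pfa_json)
```

Because the document is ordinary JSON, it can be produced in one language and consumed in another. Executing it still requires a PFA implementation such as those listed under Implementations.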
Release history
{| class="wikitable"
! Version !! Release date
|-
| Version 0.8.1 || November 2015
|}
Data Mining Group
The Data Mining Group is a consortium managed by the Center for Computational Science Research, Inc., a nonprofit founded in 2008.{{cite web|url=https://www.citizenaudit.org/261866627/|title=2008 EO 990|accessdate=16 Oct 2014}}
Examples
- reverse array:
# reverse input array of doubles
input: {"type": "array", "items": "double"}
output: {"type": "array", "items": "double"}
action:
  - let: {x: input}
  - let: {z: input}                 # z has the same length as x; every element is overwritten below
  - let: {l: {a.len: [x]}}
  - let: {i: {"-": [l, 1]}}         # start at the last valid index, l - 1
  - while: {">=": [i, 0]}
    do:
      # z[i] = x[l - i - 1]
      - set: {z: {attr: z, path: [i], to: {attr: x, path: [{"-": [{"-": [l, i]}, 1]}]}}}
      - set: {i: {"-": [i, 1]}}
  - z
- bubble sort:
# sort input array of doubles in ascending order
input: {"type": "array", "items": "double"}
output: {"type": "array", "items": "double"}
action:
  - let: {A: input}
  - let: {N: {a.len: [A]}}
  - let: {n: {"-": [N, 1]}}         # index of the last unsorted element
  - let: {i: 0}
  - let: {s: 0.0}                   # temporary used when swapping
  - while: {">=": [n, 0]}
    do:
      - set: {i: 0}
      - while: {"<=": [i, {"-": [n, 1]}]}
        do:
          # swap A[i] and A[i + 1] if they are out of order
          - if: {">": [{attr: A, path: [i]}, {attr: A, path: [{"+": [i, 1]}]}]}
            then:
              - set: {s: {attr: A, path: [i]}}
              - set: {A: {attr: A, path: [i], to: {attr: A, path: [{"+": [i, 1]}]}}}
              - set: {A: {attr: A, path: [{"+": [i, 1]}], to: s}}
          - set: {i: {"+": [i, 1]}}
      - set: {n: {"-": [n, 1]}}
  - A
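For readers more familiar with a general-purpose language, the two scoring engines above can be approximated in plain Python. This is only a sketch for comparison: the function names <code>reverse_array</code> and <code>bubble_sort</code> are hypothetical, the code does not use any PFA library, and it keeps the explicit <code>while</code> loops of the examples instead of idiomatic shortcuts such as <code>reversed()</code> or <code>sorted()</code>.

```python
def reverse_array(x):
    """Return a reversed copy of x, using the same loop shape as the PFA example."""
    z = list(x)          # work on a copy so the caller's list is untouched
    l = len(x)
    i = l - 1            # last valid index
    while i >= 0:
        z[i] = x[l - i - 1]
        i -= 1
    return z


def bubble_sort(a):
    """Return an ascending-sorted copy of a, using the same nested loops as the PFA example."""
    A = list(a)
    n = len(A) - 1       # index of the last unsorted element
    while n >= 0:
        i = 0
        while i <= n - 1:
            if A[i] > A[i + 1]:
                # swap neighbours that are out of order
                s = A[i]
                A[i] = A[i + 1]
                A[i + 1] = s
            i += 1
        n -= 1
    return A


print(reverse_array([1.0, 2.0, 3.0]))
print(bubble_sort([3.0, 1.0, 2.0]))
```

Each pass of the outer loop in <code>bubble_sort</code> moves the largest remaining element to position <code>n</code>, which is why <code>n</code> can shrink by one per pass.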
Implementations
- [https://github.com/opendatagroup/hadrian Hadrian] (Java/Scala/JVM) - Hadrian is a complete implementation of PFA in Scala, which can be accessed through any JVM language, principally Java. It focuses on model deployment, so it is flexible (can run in restricted environments) and fast.{{Citation|title=Implementations of the Portable Format for Analytics (PFA): opendatagroup/hadrian|date=2019-08-15|url=https://github.com/opendatagroup/hadrian|publisher=Open Data Group|access-date=2019-11-22}}
- [https://pypi.org/project/titus/ Titus] (Python 2.x) - Titus is a complete, independent implementation of PFA in pure Python. It focuses on model development, so it includes model producers and PFA manipulation tools in addition to runtime execution. It runs on Python 2.
- [https://github.com/animator/titus2 Titus 2] (Python 3.x) - Titus 2 is a fork of Titus that implements PFA for Python 3.{{Citation|last=Mahato|first=Ankit|title=Titus 2 : Portable Format for Analytics (PFA) implementation for Python 3.4+: animator/titus2|date=2019-11-21|url=https://github.com/animator/titus2|access-date=2019-11-22}}
- [https://cran.r-project.org/web/packages/aurelius/index.html Aurelius] (R) - Aurelius is a toolkit for generating PFA in the R programming language. It focuses on porting models to PFA from their R equivalents. To validate or execute scoring engines, Aurelius sends them to Titus through rPython (so both must be installed).
- [https://opendatagroup.github.io/Hadrian/Antinous-basics Antinous] (Model development in Jython) - Antinous is a model-producer plugin for Hadrian that allows Jython code to be executed anywhere a PFA scoring engine would go. It also has a library of model-producing algorithms.
References
{{reflist}}
External links
- {{Official website|https://dmg.org/pfa/}}
- {{GitHub|datamininggroup/pfa}}
- [https://github.com/datamininggroup/pfa/releases/download/0.8.1/pfa-specification.pdf PFA 0.8.1 Specification]
- [http://dmg.org/pfa/docs/document_structure/ PFA Document Structure]
- [https://github.com/orgesleka/galois Python Multi-User Rest-Api Server for Deploying Portable Format For Analytics]
- [https://github.com/animator/titus2 Titus 2 : Portable Format for Analytics (PFA) implementation for Python 3]