
The Galyleo Python Client

The Galyleo Python client is a module designed to convert Python structures into Galyleo Tables, and send them to dashboards for use with the Galyleo editor. It consists of four components:

  • galyleo.galyleo_table: classes and methods to create GalyleoTables, convert Python data structures into them, and produce and read JSON versions of the tables.

  • galyleo.galyleo_jupyterlab_client: classes and methods to send Galyleo Tables to Galyleo dashboards running under JupyterLab clients

  • galyleo.galyleo_constants: Symbolic constants used by these packages and the code which uses them

  • galyleo.galyleo_exceptions: Exceptions thrown by the package

Installation

The galyleo module can be installed using pip:

pip install --extra-index-url https://pypi.engagelively.com galyleo

Once the module has been more thoroughly tested, it will be published on the standard PyPI servers.

License

galyleo is released under a standard BSD 3-Clause licence by engageLively.

Galyleo Table

class galyleo.galyleo_table.GalyleoTable(name: str)

A Galyleo Dashboard Table. Used to create a Galyleo Dashboard Table from any of a number of sources, and then generate an object that is suitable for storage (as a JSON file). A GalyleoTable is very similar to a Google Visualization data table, and can be converted to a Google Visualization Data Table on either the Python or the JavaScript side. Convenience routines are provided to import data from pandas and from JSON format.

aggregate_by(aggregate_column_names, new_column_name='count', new_table_name=None)

Create a new table by aggregating over multiple columns. The resulting table contains the aggregate columns and the new column; for each unique combination of values in the aggregate columns, the new column holds the count of rows in this table with that combination. The new table will have the name new_table_name. Throws an InvalidDataException if aggregate_column_names is not a subset of the names in self.schema.

Args:

aggregate_column_names: names of the columns to aggregate over

new_column_name: name of the column for the aggregate count. Defaults to count

new_table_name: name of the new table. If omitted, defaults to None, in which case a name will be generated

Returns:

A new table with name new_table_name, or a generated name if new_table_name == None

Throws:

InvalidDataException if one of the column names is missing
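The counting behaviour described for aggregate_by can be illustrated in plain Python, without the galyleo package. This is only a sketch of the documented semantics, not the library's implementation; the helper name count_by and the sample rows are hypothetical, and ValueError stands in for InvalidDataException.

```python
from collections import Counter


def count_by(column_names, rows, aggregate_column_names):
    """Count the rows for each unique combination of values in the given columns.

    column_names is a list of column names; rows is a list of lists.
    """
    missing = [n for n in aggregate_column_names if n not in column_names]
    if missing:
        # The documented method throws InvalidDataException in this case.
        raise ValueError(f"unknown columns: {missing}")
    indices = [column_names.index(n) for n in aggregate_column_names]
    counts = Counter(tuple(row[i] for i in indices) for row in rows)
    # One output row per combination: the key values followed by the count.
    return [list(key) + [count] for key, count in counts.items()]


column_names = ["Year", "State", "Party"]
rows = [
    [2016, "CA", "Democratic"],
    [2016, "CA", "Republican"],
    [2020, "CA", "Democratic"],
    [2016, "TX", "Republican"],
]
result = count_by(column_names, rows, ["Year", "State"])
```

Here result holds one row per (Year, State) pair, with the number of matching input rows as the last entry.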

as_dictionary()

Return the table as a dictionary of the form {"name": <table_name>, "table": <table_struct>}, where table_struct is of the form {"columns": [<list of schema records>], "rows": [<list of rows of the table>]}.

A schema record is a record of the form {"name": <column_name>, "type": <column_type>}, where type is one of the Galyleo types (GALYLEO_STRING, GALYLEO_NUMBER, GALYLEO_BOOLEAN, GALYLEO_DATE, GALYLEO_DATETIME, GALYLEO_TIME_OF_DAY). All of these are defined in galyleo_constants.

Args:

None

Returns:

{"name": <table_name>, "table": {"columns": [<list of schema records>], "rows": [<list of rows of the table>]}}
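As an illustration of the dictionary form, the following sketch builds one by hand and checks its shape. The table name and rows are hypothetical; the type strings are the values listed for GALYLEO_SCHEMA_TYPES in galyleo_constants.

```python
import json

table_as_dict = {
    "name": "presidential_vote",  # hypothetical table name
    "table": {
        "columns": [
            {"name": "Year", "type": "number"},
            {"name": "State", "type": "string"},
        ],
        "rows": [[2016, "CA"], [2020, "TX"]],
    },
}

# Every row should have one entry per schema record.
column_count = len(table_as_dict["table"]["columns"])
rows_ok = all(len(row) == column_count for row in table_as_dict["table"]["rows"])

# The structure survives a round trip through a JSON string unchanged.
round_trip = json.loads(json.dumps(table_as_dict))
```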

equal(table, names_must_match=False)

Test to see if this table is equal to another table, passed as an argument. Two tables are equal if their schemas are the same length, their column names and types match, and their data is the same and in the same order. If names_must_match == True (default is False), then the table names must also match.

Args:

table (GalyleoTable): table to be checked for equality

names_must_match (bool): (default False) if True, table names must also match

Returns:

True if equal, False otherwise
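The equality rule can be sketched on tables held in the as_dictionary() form. The function tables_equal and the sample tables are hypothetical and only illustrate the documented rule; they are not the library's code.

```python
def tables_equal(a, b, names_must_match=False):
    """Compare two tables given in the as_dictionary() form."""
    if names_must_match and a["name"] != b["name"]:
        return False
    # Schemas must have the same length, names, and types, in the same order.
    if a["table"]["columns"] != b["table"]["columns"]:
        return False
    # Rows must hold the same data in the same order.
    return a["table"]["rows"] == b["table"]["rows"]


columns = [{"name": "Year", "type": "number"}]
a = {"name": "a", "table": {"columns": columns, "rows": [[2016], [2020]]}}
b = {"name": "b", "table": {"columns": columns, "rows": [[2016], [2020]]}}
c = {"name": "a", "table": {"columns": columns, "rows": [[2020], [2016]]}}
```

With these tables, a and b compare equal unless names_must_match is True, and a and c differ because the row order differs.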

filter_by_function(column_name, function, new_table_name, column_types={})

Create a new table, with name new_table_name, containing the rows for which function(row[column_name]) == True. The new table will have columns {self.columns} - {column_name}, with the same types and in the same order. Throws an InvalidDataException if:

  1. new_table_name is None or not a string

  2. column_name is not a name of an existing column

  3. column_types is not empty and the type of the selected column doesn’t match one of the allowed types

Args:

column_name: the column to filter by

function: a Boolean function with a single argument of the type of columns[column_name]

new_table_name: name of the new table

column_types: set of the allowed column types; if empty, any type is permitted

Returns:

A table with column[column_name] missing and filtered

Throws:

InvalidDataException if new_table_name is empty, column_name is not a name of an existing column, or the type of column_name isn’t in column_types (if column_types is non-empty)
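A minimal plain-Python sketch of the filter-then-drop behaviour described above, assuming a schema held as a list of (name, type) pairs. The helper filter_rows and the sample data are hypothetical, and ValueError stands in for InvalidDataException.

```python
def filter_rows(schema, rows, column_name, function):
    """Keep rows where function(value) is true, then drop the filter column.

    schema is a list of (name, type) pairs; rows is a list of lists.
    """
    names = [name for name, _ in schema]
    if column_name not in names:
        # The documented method throws InvalidDataException in this case.
        raise ValueError(f"no column named {column_name!r}")
    index = names.index(column_name)
    # Remaining columns keep their types and their relative order.
    new_schema = schema[:index] + schema[index + 1:]
    new_rows = [
        row[:index] + row[index + 1:] for row in rows if function(row[index])
    ]
    return new_schema, new_rows


schema = [("Year", "number"), ("State", "string"), ("Votes", "number")]
rows = [[2016, "CA", 10], [2016, "TX", 20], [2020, "CA", 30]]
new_schema, new_rows = filter_rows(schema, rows, "State", lambda x: x == "CA")
```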

filter_equal(column_name, value, new_table_name, column_types)

A convenience method over filter_by_function. This is identical to filter_by_function(column_name, lambda x: x == value, new_table_name, column_types)

Args:

column_name: the column to filter by

value: the value to match for equality

new_table_name: name of the new table

column_types: set of the allowed column types; if empty, any type is permitted

Returns:

A table with column[column_name] missing and filtered

Throws:

InvalidDataException if new_table_name is empty, column_name is not a name of an existing column, or the type of column_name isn’t in column_types (if column_types is non-empty)

filter_range(column_name, range_as_tuple, new_table_name, column_types)

A convenience method over filter_by_function. This is identical to filter_by_function(column_name, lambda x: x >= range_as_tuple[0] and x <= range_as_tuple[1], new_table_name, column_types)

Args:

column_name: the column to filter by

range_as_tuple: the tuple representing the range

new_table_name: name of the new table

column_types: set of the allowed column types; if empty, any type is permitted

Returns:

A table with column[column_name] missing and filtered

Throws:

InvalidDataException if new_table_name is empty, column_name is not a name of an existing column, the type of column_name isn’t in column_types (if column_types is non-empty), or len(range_as_tuple) != 2
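The range test compares with >= and <=, so both ends of the range are included. The sketch below builds such a predicate from a tuple; make_range_predicate is a hypothetical helper, and ValueError stands in for InvalidDataException.

```python
def make_range_predicate(range_as_tuple):
    """Return a function that is true for values inside the inclusive range."""
    if len(range_as_tuple) != 2:
        # The documented method throws InvalidDataException in this case.
        raise ValueError("range_as_tuple must have exactly two elements")
    low, high = range_as_tuple
    return lambda x: x >= low and x <= high


in_range = make_range_predicate((2000, 2010))
years = [1996, 2000, 2004, 2010, 2012]
selected = [year for year in years if in_range(year)]
```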

from_json(json_form, overwrite_name=True)

Load the table from a JSON string of the form produced by to_json(). Note that if the overwrite_name parameter is True (the default), this will also overwrite the table name.

Throws InvalidDataException if json_form is malformed.

Args:

json_form: a JSON string of the dictionary form of the table

overwrite_name: if True (the default), the table name is also overwritten

Returns:

None

Throws:

InvalidDataException if json_form is malformed

load_from_dataframe(dataframe, schema=None)

Load from a pandas DataFrame. The schema is given in the optional second parameter, as a list of records {"name": <name>, "type": <type>}, where type is a Galyleo type (GALYLEO_STRING, GALYLEO_NUMBER, GALYLEO_BOOLEAN, GALYLEO_DATE, GALYLEO_DATETIME, GALYLEO_TIME_OF_DAY). If the second parameter is not present, the schema is derived from the column names and types of the dataframe. Each row of the dataframe becomes a row of the table.

Args:

dataframe (pandas dataframe): the pandas dataframe to load from

schema (list of dictionaries): if present, the schema in list of dictionary form; each dictionary is of the form {"name": <column name>, "type": <column type>}

load_from_dictionary(dict)

Load data from a dictionary of the form {"columns": [<list of schema records>], "rows": [<list of rows of the table>]}.

A schema record is a record of the form {"name": <column_name>, "type": <column_type>}, where type is one of the Galyleo types (GALYLEO_STRING, GALYLEO_NUMBER, GALYLEO_BOOLEAN, GALYLEO_DATE, GALYLEO_DATETIME, GALYLEO_TIME_OF_DAY).

Throws InvalidDataException if the dictionary is of the wrong format or the rows don’t match the columns.

Args:

dict: the table as a dictionary (a value returned by as_dictionary)

Throws:

InvalidDataException if dict is malformed

load_from_schema_and_data(schema: list, data: list)

Load from a pair (schema, data). Schema is a list of pairs [(<column_name>, <column_type>)] where column_type is one of the Galyleo types (GALYLEO_STRING, GALYLEO_NUMBER, GALYLEO_BOOLEAN, GALYLEO_DATE, GALYLEO_DATETIME, GALYLEO_TIME_OF_DAY). All of these are defined in galyleo_constants. data is a list of lists, where each list is a row of the table. Two conditions:

  1. Each type must be one of types listed above

  2. Each list in data must have the same length as the schema, and the type of each element must match the corresponding schema type

Throws an InvalidDataException if either of these is violated.

Args:

schema (list of pairs, (name, type)): the schema as a list of pairs

data (list of lists): the data as a list of lists
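The two conditions can be checked with a few lines of plain Python before loading. This is a hedged sketch: check_schema_and_data is a hypothetical helper, it reports problems rather than raising, and it leaves out the per-element type check because the mapping from Python types to Galyleo types is not specified here.

```python
# Type strings as listed for galyleo_constants.GALYLEO_SCHEMA_TYPES.
GALYLEO_SCHEMA_TYPES = ["string", "number", "boolean", "date", "datetime", "timeofday"]


def check_schema_and_data(schema, data):
    """Return a list of problems found; an empty list means both checks pass."""
    problems = []
    # Condition 1: each column type must be a known Galyleo type.
    for name, column_type in schema:
        if column_type not in GALYLEO_SCHEMA_TYPES:
            problems.append(f"column {name!r} has unknown type {column_type!r}")
    # Condition 2 (length part only): each row must match the schema length.
    for i, row in enumerate(data):
        if len(row) != len(schema):
            problems.append(f"row {i} has {len(row)} entries, expected {len(schema)}")
    return problems


good = check_schema_and_data(
    [("Year", "number"), ("State", "string")], [[2016, "CA"], [2020, "TX"]]
)
bad = check_schema_and_data(
    [("Year", "integer"), ("State", "string")], [[2016, "CA"], [2020]]
)
```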

pivot_on_column(pivot_column_name, value_column_name, new_table_name, pivot_column_values={}, other_column=False)

The pivot_on_column method breaks out value_column_name into n separate columns, one for each member of pivot_column_values plus, if other_column = True, an “Other” column. This is easiest to see with an example. Consider a table with columns (Year, State, Party, Percentage). pivot_on_column('Party', 'Percentage', 'pivot_table', {'Republican', 'Democratic'}, False) would create a new table with columns Year, State, Republican, Democratic, where the values in the Republican and Democratic columns are the values in the Percentage column where the Party column value was Republican or Democratic, respectively. If other_column = True, an additional column, Other, is added, whose value is (generally) the sum of the values where Party is neither Republican nor Democratic.

Args:

pivot_column_name: the column holding the keys to pivot on

value_column_name: the column holding the values to spread out over the pivots

new_table_name: name of the new table

pivot_column_values: the values to pivot on. If empty, all values used

other_column: if True, aggregate other values into a column

Returns:

A table as described in the comments above

Throws:

InvalidDataException if new_table_name is empty, pivot_column_name is not a name of an existing column, or value_column_name is not the name of an existing column
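The (Year, State, Party, Percentage) example can be worked through in plain Python. This is a sketch of the described result only, not the library's implementation: the pivot helper is hypothetical, the percentages are made-up illustration values, and filling a missing cell with None is an assumption of this sketch.

```python
def pivot(rows, pivot_values, other_column=False):
    """Pivot (year, state, party, percentage) rows on the party column."""
    pivot_values = list(pivot_values)
    cells_by_key = {}  # (year, state) -> {column name: percentage}
    for year, state, party, percentage in rows:
        cells = cells_by_key.setdefault((year, state), {})
        if party in pivot_values:
            cells[party] = percentage
        elif other_column:
            # Values for every other party are summed into "Other".
            cells["Other"] = cells.get("Other", 0) + percentage
    columns = pivot_values + (["Other"] if other_column else [])
    return [
        [year, state] + [cells.get(column) for column in columns]
        for (year, state), cells in cells_by_key.items()
    ]


rows = [
    (2016, "CA", "Democratic", 62),
    (2016, "CA", "Republican", 32),
    (2016, "CA", "Libertarian", 3),
    (2016, "CA", "Green", 2),
]
without_other = pivot(rows, ["Republican", "Democratic"])
with_other = pivot(rows, ["Republican", "Democratic"], other_column=True)
```

Each output row has the form [Year, State, Republican, Democratic], with a trailing Other entry when other_column is True.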

to_json()

Return the table as a JSON string, suitable for transmitting as a message or saving to a file. This is just a JSON form of the dictionary form of the table. (See as_dictionary)

Returns:

as_dictionary() as a JSON string

JupyterLab Client

class galyleo.galyleo_jupyterlab_client.GalyleoClient

The Dashboard Client. This is the client which sends the tables to the dashboard and handles requests coming from the dashboard for tables.

send_data_to_dashboard(galyleo_table, dashboard_name: Optional[str] = None) -> None

The routine to send a GalyleoTable to the dashboard, optionally specifying a specific dashboard to send the data to. If None is specified, the table is sent to all the dashboards. The table must not have more than galyleo_constants.MAX_NUMBER_ROWS rows, nor be (in JSON form) larger than galyleo_constants.MAX_DATA_SIZE. If either limit is exceeded, a DataSizeExceeded exception is thrown. NOTE: this sends data to one or more open dashboard editors in JupyterLab. If there are no dashboard editors open, it will have no effect.

Args:

galyleo_table: the table to send to the dashboard

dashboard_name: name of the dashboard editor to send it to (if None, sent to all)
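A table's size can be checked before sending, to anticipate a DataSizeExceeded exception. The sketch below is an assumption-laden illustration: exceeds_limits is a hypothetical helper, the size is measured as UTF-8 bytes of the JSON form, and the row limit is passed in by the caller because its value is not given in this document.

```python
import json

MAX_DATA_SIZE = 16777216  # value listed for galyleo_constants.MAX_DATA_SIZE


def exceeds_limits(table_as_dict, max_rows):
    """Report whether a table in as_dictionary() form is over either limit.

    max_rows is a caller-supplied stand-in for the row-limit constant.
    """
    rows = table_as_dict["table"]["rows"]
    json_size = len(json.dumps(table_as_dict).encode("utf-8"))
    return len(rows) > max_rows or json_size > MAX_DATA_SIZE


small_table = {
    "name": "small",  # hypothetical table
    "table": {
        "columns": [{"name": "Year", "type": "number"}],
        "rows": [[2016], [2020]],
    },
}
within = exceeds_limits(small_table, max_rows=10)
too_many_rows = exceeds_limits(small_table, max_rows=1)
```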

Galyleo Exceptions

Galyleo-specific exceptions

exception galyleo.galyleo_exceptions.DataSizeExceeded

Raised when the data volume is too large on a single request. The exact limitations are specified in README.md and in galyleo_constants

exception galyleo.galyleo_exceptions.DataSizeIsZero

Raised when the data set is empty.

exception galyleo.galyleo_exceptions.Error

Base class for other exceptions.

exception galyleo.galyleo_exceptions.InvalidDataException

An exception thrown when a data table (list of rows) doesn’t match an accompanying schema, a bad schema is specified, or a table row is the wrong length.

Galyleo Constants

Constants that are used throughout the module. These include:

  1. Data types for a table (GALYLEO_STRING, GALYLEO_NUMBER, GALYLEO_BOOLEAN, GALYLEO_DATE, GALYLEO_DATETIME, GALYLEO_TIME_OF_DAY)

  2. GALYLEO_SCHEMA_TYPES: The types in a list

  3. MAX_DATA_SIZE: Maximum size, in bytes, of a GalyleoTable

  4. MAX_TABLE_ROWS: Maximum number of rows in a GalyleoTable

galyleo.galyleo_constants.GALYLEO_SCHEMA_TYPES = ['string', 'number', 'boolean', 'date', 'datetime', 'timeofday']

Types for a chart/dashboard table schema

galyleo.galyleo_constants.GALYLEO_TIME_OF_DAY = 'timeofday'

The type for time-of-day columns, one of the entries in GALYLEO_SCHEMA_TYPES

galyleo.galyleo_constants.MAX_DATA_SIZE = 16777216

Maximum size of a table being sent to the dashboard. Exceeding this will throw a DataSizeExceeded exception. The maximum number of rows in a table is a separate limit.