ptrail.features package

Submodules

ptrail.features.contextual_features module

The contextual features module contains several contextual features such as intersection of trajectories and stop and stay point detection. Moreover, features like distance from Points of Interest, water bodies and other demographic features related to the trajectory data are calculated. The demographic features are extracted with the help of the Python OSMnx library.

Authors: Yaksh J Haranwala, Salman Haidri
class ptrail.features.contextual_features.ContextualFeatures[source]

Bases: object

static nearest_poi(coords: tuple, dist_threshold, tags: dict)[source]

Given a coordinate point and a distance threshold, find the Point of Interest which is the nearest to it within the given distance threshold.

Warning

Users are advised to be mindful of the tags passed in as the parameter. The more tags are passed, the longer the OSMnx library takes to download the information from OpenStreetMap. Moreover, an active internet connection is required to execute this function.

Note

If several tags (POIs) are passed in, the method finds the closest one based on distance and returns only that one. It does not return the others that may or may not be present within the threshold of the given point.

Parameters
  • coords (tuple) – The point near which the Point of Interest is to be found.

  • dist_threshold – The maximum distance from the point within which the Point of Interest is searched for.

  • tags (dict) – The dictionary containing tags of Points of interest.

Returns

A pandas DataFrame containing information about the Point of Interest nearest to the given point.

Return type

pandas.core.frame.DataFrame

Raises

JSONDecodeError: – One or more given tags are invalid.
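The selection rule described in the Note above (return only the closest POI that lies within the threshold) can be sketched without any network access. This is a minimal sketch and not how the library queries OSMnx: the `pois` dictionary, `haversine_m` and `nearest_within` are hypothetical names, and the threshold is assumed to be in metres.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (assumed constant)


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_within(coords, pois, dist_threshold):
    """Return (name, distance) of the closest POI within the threshold, or None."""
    candidates = [(name, haversine_m(coords, loc)) for name, loc in pois.items()]
    in_range = [c for c in candidates if c[1] <= dist_threshold]
    return min(in_range, key=lambda c: c[1]) if in_range else None


# Hypothetical POIs, given as name -> (lat, lon).
pois = {"bank": (47.5610, -52.7130), "cafe": (47.5650, -52.7100)}
result = nearest_within((47.5615, -52.7126), pois, dist_threshold=500)
```

Both hypothetical POIs fall within the threshold here, but only the closer one is returned, which mirrors the behaviour the Note describes.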

static traj_intersect_inside_polygon(df1: ptrail.core.TrajectoryDF.PTRAILDataFrame, df2: ptrail.core.TrajectoryDF.PTRAILDataFrame, polygon: shapely.geometry.Polygon)[source]

Given df1 and df2 containing trajectory data along with a polygon, check whether the trajectories are inside the polygon and, if they are, whether they intersect at any point.

Warning

While creating a polygon, the format of the coordinates is: (longitude, latitude) instead of (latitude, longitude). Beware of that, otherwise the results will be incorrect.

Note

df1 and df2 should each contain the data of only one trajectory. If either contains more than one trajectory, the results might be unexpected.

Parameters
  • df1 (PTRAILDataFrame) – Trajectory Dataframe 1.

  • df2 (PTRAILDataFrame) – Trajectory Dataframe 2.

  • polygon (Polygon) – The area inside which it is to be determined if the trajectories intersect or not.

Returns

  • PTRAILDataFrame – A dataframe containing trajectories that are inside the polygon.

  • geopandas.GeoDataFrame – An empty dataframe if the trajectories do not intersect.

static trajectories_inside_polygon(df: ptrail.core.TrajectoryDF.PTRAILDataFrame, polygon: shapely.geometry.Polygon)[source]

Given a trajectory dataframe and a Polygon, find out all the trajectories that are inside the given polygon.

Warning

While creating a polygon, the format of the coordinates is: (longitude, latitude) instead of (latitude, longitude). Beware of that, otherwise the results will be incorrect.

Parameters
  • df (PTRAILDataFrame) – The dataframe containing the trajectory data.

  • polygon (Polygon) – The polygon inside which the points are to be found.

Returns

A dataframe containing trajectories that are inside the polygon.

Return type

PTRAILDataFrame
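The coordinate-order warning above is easy to get wrong, so a small illustration may help. The sketch below swaps in a plain ray-casting test instead of a shapely Polygon so that it runs with the standard library only; `point_in_polygon`, `square` and `points` are hypothetical names and data.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed
        # by a horizontal ray cast from the point.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Vertices in (longitude, latitude) order, as the warning requires.
square = [(-53.0, 47.0), (-52.0, 47.0), (-52.0, 48.0), (-53.0, 48.0)]

# Hypothetical trajectory points as (traj_id, lat, lon).
points = [("a", 47.5, -52.5), ("a", 47.6, -52.4), ("b", 49.0, -52.5)]

# Keep only the points that fall inside the polygon.
inside = [p for p in points if point_in_polygon(p[2], p[1], square)]
```

Passing the same point with latitude and longitude swapped, for example `point_in_polygon(47.5, -52.5, square)`, reports it as outside, which is the kind of silent error the warning refers to.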

static visited_location(df: ptrail.core.TrajectoryDF.PTRAILDataFrame, geo_layers: Union[pandas.DataFrame, geopandas.GeoDataFrame], visited_location_name: str, location_column_name: str)[source]

Create a column called visited_Location for all the pastures present in the dataset.

Warning

When using this method, make sure that the dataframe passed as the geo_layers parameter has latitude and longitude columns named ‘lat’ and ‘lon’ respectively. If this format is not followed, a KeyError will be thrown.

Note

Depending on the size of the dataset and of the surrounding data passed in, this function can take a long time to execute. It has been parallelized to make it faster, but it can still be slow when either of the datasets is very large.

Parameters
  • df (PTRAILDataFrame) – The dataframe containing the dataset.

  • geo_layers (Union[pd.DataFrame, gpd.GeoDataFrame]) – The Dataframe containing the geographical layers near the trajectory data.

  • visited_location_name (Text) – The location for which it is to be checked whether the object visited it or not.

  • location_column_name (Text) – The name of the column that contains the location to be checked.

Returns

The Dataframe containing a new column indicating whether the animal has visited the pasture or not.

Return type

PTRAILDataFrame

Raises

KeyError: – The column or the location name does not exist.

static visited_poi(df: ptrail.core.TrajectoryDF.PTRAILDataFrame, surrounding_data: Union[geopandas.GeoDataFrame, pandas.DataFrame, ptrail.core.TrajectoryDF.PTRAILDataFrame], dist_column_label: str, nearby_threshold: int)[source]

Given surrounding data with information about the distance to the nearest POI source from a given coordinate, check whether the objects in the given trajectory data have visited/crossed those POIs or not.

Warning

For this method to work, the surrounding dataset NEEDS to have a column containing the distance to the nearest POI. For more info, see the Starkey habitat dataset, which has columns like ‘DistCWat’ and ‘DistEWat’.

Parameters
  • df (PTRAILDataFrame) – The dataframe containing the trajectory data.

  • surrounding_data (Union[gpd.GeoDataFrame, pd.DataFrame]) – The surrounding data that needs to contain the information of distance to the nearest water body.

  • dist_column_label (Text) – The name of the column containing the distance information.

  • nearby_threshold (int) – The maximum distance between the POI and the current location of the object within which the object is considered to be crossing/visiting the POI.

Returns

The dataframe containing the new column indicating whether the object at that point is near a POI.

Return type

PTRAILDataFrame
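The threshold check described for nearby_threshold can be sketched as follows. How the library matches trajectory points to rows of the surrounding data is not stated here, so this sketch assumes an exact coordinate lookup; `flag_visits`, `surrounding` and `trajectory` are hypothetical names, and the distances are assumed to be in metres.

```python
# Hypothetical surrounding data: one row per coordinate with the distance
# to the nearest water source, in the style of a 'DistCWat' column.
surrounding = {
    (45.25, -118.50): {"DistCWat": 12},
    (45.26, -118.51): {"DistCWat": 340},
}

# Hypothetical trajectory points as (traj_id, lat, lon).
trajectory = [("elk1", 45.25, -118.50), ("elk1", 45.26, -118.51)]


def flag_visits(points, surrounding_data, dist_column_label, nearby_threshold):
    """Mark a point as a visit when its POI distance is within the threshold."""
    flags = []
    for _, lat, lon in points:
        row = surrounding_data.get((lat, lon))
        # Points without surrounding information are treated as not visiting.
        near = row is not None and row[dist_column_label] <= nearby_threshold
        flags.append(near)
    return flags


visited = flag_visits(trajectory, surrounding, "DistCWat", nearby_threshold=50)
```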

ptrail.features.helper_functions module

This module contains all the helper functions for the parallel calculations in the spatial and temporal features classes.

Warning

These functions should not be used directly, as doing so results in slower calculation and execution times. In some cases, these functions might even yield wrong results if used directly. They are meant to be used only as helpers. For calculation of features, use the ones in the features package.

Authors: Yaksh J Haranwala, Salman Haidri
class ptrail.features.helper_functions.Helpers[source]

Bases: object

static bearing_helper(dataframe)[source]

This function is the helper function of create_bearing_column(). create_bearing_column() delegates the task of calculating the bearing between 2 points to this function because the original function runs multiple instances of this function in parallel. This function calculates the bearing between 2 consecutive points in the entire DF, creates a column in the dataframe and returns it.

Parameters

dataframe (PTRAILDataFrame) – The dataframe on which the calculation is to be done.

Returns

The dataframe containing the Bearing column.

Return type

PTRAILDataFrame
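The delegation pattern described for these helpers (a public function splits the data, runs several instances of a helper in parallel and recombines the results) can be sketched as below. This is only an illustration of the pattern: the use of a thread pool, the chunking scheme and the names `helper`, `split` and `run_in_parallel` are all assumptions and are not taken from the library.

```python
from concurrent.futures import ThreadPoolExecutor


def helper(chunk):
    """Stand-in for a helper: process one chunk and return the result."""
    return [value * 2 for value in chunk]


def split(rows, n_chunks):
    """Split rows into at most n_chunks contiguous pieces."""
    size = max(1, -(-len(rows) // n_chunks))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def run_in_parallel(rows, n_workers=4):
    """Public-facing function: split, delegate to the helper, and recombine."""
    chunks = split(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() preserves chunk order, so results can simply be concatenated.
        results = list(pool.map(helper, chunks))
    return [item for part in results for item in part]


doubled = run_in_parallel(list(range(10)))
```

A helper called on its own sees only the data it is handed, which is one plausible reason the module warns against calling helpers directly.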

static distance_between_consecutive_helper(dataframe)[source]

This function is the helper function of the create_distance_between_consecutive_column() function. The create_distance_between_consecutive_column() function delegates the actual task of calculating the distance between 2 consecutive points to this function. This function does the calculation, creates a column called Distance_prev_to_curr, places it in the dataframe and returns it.

Parameters

dataframe (PTRAILDataFrame) – The dataframe on which calculation is to be performed.

Returns

The dataframe containing the resultant Distance_prev_to_curr column.

Return type

PTRAILDataFrame

References

Arina De Jesus Amador Monteiro Sanches. ‘Uma Arquitetura E Implementação Do Módulo De Pré-processamento Para Biblioteca Pymove’. Bachelor’s thesis. Universidade Federal Do Ceará, 2019.

static distance_from_given_point_helper(dataframe, coordinates)[source]

This function is the helper function of the create_distance_from_point_column() function. The create_distance_from_point_column() function delegates the actual task of calculating the distance between the given point and all the points in the dataframe to this function. This function calculates the distance and creates another column called ‘Distance_to_specified_point’.

Parameters
  • dataframe (PTRAILDataFrame) – The dataframe on which calculation is to be done.

  • coordinates (tuple) – The coordinates from which the distance is to be calculated.

Returns

The dataframe containing the resultant Distance_from_(x, y) column.

Return type

pandas.core.frame.DataFrame

static distance_from_start_helper(dataframe)[source]

This function is the helper function of the create_distance_from_start_column() function. The create_distance_from_start_column() function delegates the actual task of calculating the distance from the start point of the trajectory to the current point to this function. This function does the calculation, creates a column called Distance_start_to_curr, places it in the dataframe and returns it.

Parameters

dataframe (PTRAILDataFrame) – The dataframe on which calculation is to be performed.

Returns

The dataframe containing the resultant Distance_start_to_curr column.

Return type

pandas.core.frame.DataFrame

static end_location_helper(dataframe, ids_)[source]

This function is the helper function of get_end_location(). The get_end_location() function delegates the task of calculating the end location of the trajectories in the dataframe to this function because the original function runs multiple instances of this function in parallel. This function finds the end location of the specified trajectory IDs in the DF and then returns another dataframe containing the end latitude, end longitude and trajectory ID for each trajectory.

Parameters
  • dataframe (PTRAILDataFrame) – The dataframe of which the locations are to be found.

  • ids_ (list) – List of trajectory ids for which the end locations are to be calculated.

Returns

New dataframe containing Trajectory ID as index and latitude and longitude as other 2 columns.

Return type

pandas.core.frame.DataFrame

static end_time_helper(dataframe, ids_)[source]

This function is the helper function of get_end_time(). The get_end_time() function delegates the task of calculating the end time of the trajectories in the dataframe to this function because the original function runs multiple instances of this function in parallel. This function finds the end time of the specified trajectory IDs in the DF and then returns another dataframe containing the end latitude, end longitude, DateTime and trajectory ID for each trajectory.

Parameters
  • dataframe (PTRAILDataFrame) – The dataframe containing the original data.

  • ids_ (list) – List of trajectory ids for which the end times are to be calculated.

Returns

New dataframe containing Trajectory ID as index and the end time of all trajectories.

Return type

pandas.core.frame.DataFrame

static number_of_location_helper(dataframe, ids_)[source]

This is the helper function for the get_number_of_locations() function. get_number_of_locations() delegates the actual task of calculating the number of unique locations visited by a particular object to this function. This function calculates the number of unique locations visited by each unique object and returns a dataframe containing the results.

Parameters
  • dataframe (PTRAILDataFrame) – The dataframe containing all the original data.

  • ids_ (list) – The list of ids for which the number of unique locations visited is to be calculated.

Returns

Dataframe containing the results.

Return type

pandas.core.frame.DataFrame

static point_within_range_helper(dataframe, coordinates, dist_range)[source]

This is the helper function for the create_point_within_range_column() function.

Parameters
  • dataframe (PTRAILDataFrame) – The dataframe on which the operation is to be performed.

  • coordinates (tuple) – The coordinates from which the distance is to be checked.

  • dist_range – The range within which the distance from the coordinates should lie.

Returns

The dataframe containing the resultant Within_X_m_from_(x,y) column.

Return type

pandas.core.frame.DataFrame

static start_location_helper(dataframe, ids_)[source]

This function is the helper function of get_start_location(). The get_start_location() function delegates the task of calculating the start location of the trajectories in the dataframe to this function because the original function runs multiple instances of this function in parallel. This function finds the start location of the specified trajectory IDs in the DF and then returns another dataframe containing the start latitude, start longitude and trajectory ID for each trajectory.

Parameters
  • dataframe (PTRAILDataFrame) – The dataframe of which the locations are to be found.

  • ids_ (list) – List of trajectory ids for which the start locations are to be calculated.

Returns

New dataframe containing Trajectory ID as index and latitude and longitude as the other 2 columns.

Return type

pandas.core.frame.DataFrame

static start_time_helper(dataframe, ids_)[source]

This function is the helper function of get_start_time(). The get_start_time() function delegates the task of calculating the start time of the trajectories in the dataframe to this function because the original function runs multiple instances of this function in parallel. This function finds the start time of the specified trajectory IDs in the DF and then returns another dataframe containing the start latitude, start longitude, DateTime and trajectory ID for each trajectory.

Parameters
  • dataframe (PTRAILDataFrame) – The dataframe containing the original data.

  • ids_ (list) – List of trajectory ids for which the start times are to be calculated.

Returns

New dataframe containing Trajectory ID as index and start time of all trajectories.

Return type

pandas.core.frame.DataFrame

static traj_duration_helper(dataframe, ids_)[source]

Calculate the duration of the trajectory, i.e. subtract the min time of the trajectory from the max time of the trajectory.

Parameters
  • dataframe (PTRAILDataFrame) – The dataframe containing all the original data.

  • ids_ (list) – A list containing all the Trajectory IDs present in the dataset.

Returns

The resultant dataframe containing all the trajectory durations.

Return type

pandas.core.frame.DataFrame

static visited_poi_helper(df, surrounding_data, dist_column_label, nearby_threshold)[source]

Given a Trajectory dataframe and another dataset with the surrounding data, find whether the given object is near a point of interest or not.

Parameters
  • df – The dataframe containing the trajectory data.

  • surrounding_data – The dataframe containing the data of the surroundings.

  • dist_column_label (Text) – The label of the column containing the distance of the coords from the nearest POI.

  • nearby_threshold (int) – The maximum distance between the POI and the current location of the object within which the object is considered to be crossing/visiting the POI.

Returns

The original dataframe with another column added to it indicating whether each point is within

ptrail.features.kinematic_features module

The kinematic_features module contains several functions of the library that calculate kinematic features based on the coordinates of the points provided in the data. This module mostly extracts and modifies data collected from an existing dataframe and appends this information to it. Many functions in this module are inspired by the PyMove library.

Authors: Yaksh J Haranwala, Salman Haidri

References

Arina De Jesus Amador Monteiro Sanches. “Uma Arquitetura E Implementação Do Módulo De Pré-processamento Para Biblioteca Pymove”. Bachelor’s thesis. Universidade Federal Do Ceará, 2019.

class ptrail.features.kinematic_features.KinematicFeatures[source]

Bases: object

static create_acceleration_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]

Create a column containing acceleration of the object from the previous to the current point.

Note

The acceleration calculated here is the acceleration between 2 consecutive points of the same trajectory. Furthermore, the acceleration yielded is in metres/second^2 (m/s^2).

Parameters

dataframe (PTRAILDataFrame) – The dataframe on which the calculation of acceleration is to be done.

Returns

The dataframe containing the resultant Acceleration_prev_to_curr column.

Return type

PTRAILDataFrame

static create_bearing_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]

Create a column containing the bearing between 2 consecutive points. Bearing is sometimes also referred to as “Forward Azimuth”. Bearing/Forward Azimuth is defined as follows:

Bearing is the horizontal angle between the direction of an object and another object, or between the object and the True North.

Note

The bearing calculated here is the bearing between 2 consecutive points of the same trajectory. Furthermore, the bearing yielded is in degrees.

Parameters

dataframe (PTRAILDataFrame) – The dataframe on which the bearing is to be calculated.

Returns

The dataframe containing the resultant Bearing_from_prev column.

Return type

PTRAILDataFrame
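The forward azimuth between two points can be computed with the standard initial-bearing formula. The sketch below is a minimal illustration under stated assumptions: `bearing_deg` is a hypothetical name, points are (lat, lon) tuples in degrees, and normalising the result to the range 0 to 360 is an assumption rather than confirmed library behaviour.

```python
import math


def bearing_deg(p, q):
    """Initial bearing in degrees (clockwise from true north) from p to q."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    d_lon = lon2 - lon1
    x = math.sin(d_lon) * math.cos(lat2)
    y = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(d_lon))
    # atan2 gives an angle in (-180, 180]; shift it into [0, 360).
    return (math.degrees(math.atan2(x, y)) + 360) % 360


north = bearing_deg((0.0, 0.0), (1.0, 0.0))  # target lies due north
east = bearing_deg((0.0, 0.0), (0.0, 1.0))   # target lies due east on the equator
```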

static create_bearing_rate_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]

Calculate the bearing rate between consecutive points and add that column to the dataframe.

Note

The bearing rate calculated here is the bearing rate between 2 consecutive points of the same trajectory. Furthermore, the bearing rate yielded is in degrees/second.

Parameters

dataframe (PTRAILDataFrame) – The dataframe on which the bearing rate is to be calculated

Returns

The dataframe containing the resultant Bearing_rate_from_prev column.

Return type

PTRAILDataFrame

static create_distance_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]

Create a column called Distance_prev_to_curr containing the distance between 2 consecutive points. The distance calculated is the Great-Circle (Haversine) distance.

Note

When the trajectory ID changes in the data, then the distance calculation again starts from the first point of the new trajectory ID and the distance-value of the first point of the new Trajectory ID will be set to 0.

Note

The Distance calculated here is the distance between 2 consecutive points of the same trajectory. Furthermore, the distance yielded is in metres (m).

Parameters

dataframe (PTRAILDataFrame) – The data where distance is to be calculated.

Returns

The dataframe containing the resultant Distance_prev_to_curr column.

Return type

PTRAILDataFrame
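The two notes above (Haversine distance in metres, reset to 0 at the first point of each trajectory ID) can be illustrated with a short sketch. It is not the library's implementation: `haversine_m`, the Earth radius value and the `rows` layout are assumptions made for the example.

```python
import math


def haversine_m(p, q, radius_m=6_371_000):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


# Hypothetical rows as (traj_id, lat, lon), already ordered by time.
rows = [("a", 0.0, 0.0), ("a", 0.0, 1.0), ("b", 10.0, 10.0), ("b", 11.0, 10.0)]

distances = []
prev = None
for traj_id, lat, lon in rows:
    if prev is None or prev[0] != traj_id:
        # First point of a trajectory: the distance resets to 0.
        distances.append(0.0)
    else:
        distances.append(haversine_m(prev[1:], (lat, lon)))
    prev = (traj_id, lat, lon)
```

With the assumed radius, one degree along the equator or along a meridian comes out at roughly 111.2 km.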

static create_distance_from_point_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame, coordinates: tuple)[source]

Given a point, this function calculates the distance between that point and all the points present in the dataframe and adds that column into the dataframe.

Note

The distance yielded here is in metres.

Parameters
  • dataframe (PTRAILDataFrame) – The dataframe on which calculation is to be done.

  • coordinates (tuple) – The coordinates from which the distance is to be calculated.

Returns

The dataframe containing the resultant Distance_from_(x, y) column.

Return type

PTRAILDataFrame

static create_distance_from_start_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]

Create a column containing distance between the start location and the rest of the points using Haversine formula. The distance calculated is the Great-Circle distance.

Note

When the trajectory ID changes in the data, then the distance calculation again starts from the first point of the new trajectory ID and the first distance of the new trajectory ID will be set to 0.

Note

The Distance calculated here is the distance between the start point and the current points of the same trajectory. Furthermore, the distance yielded is in metres (m).

Parameters

dataframe (PTRAILDataFrame) – The data where distance is to be calculated.

Returns

The dataframe containing the resultant Distance_start_to_curr column.

Return type

PTRAILDataFrame

static create_jerk_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]

Create a column containing jerk of the object from previous to the current point.

Note

The jerk calculated here is the jerk between 2 consecutive points of the same trajectory. Furthermore, the jerk yielded is in metres/second^3 (m/s^3).

Parameters

dataframe (PTRAILDataFrame) – The dataframe on which the calculation of jerk is to be done.

Returns

The dataframe containing the resultant jerk_prev_to_curr column.

Return type

PTRAILDataFrame
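Acceleration and jerk follow the same recipe: take the change in the previous quantity between 2 consecutive points and divide by the elapsed time. The sketch below shows that chain with hypothetical data; `rate_of_change` is a hypothetical name, and using 0 for the first point is an assumption made for the example.

```python
def rate_of_change(values, seconds):
    """Difference each value from the previous one and divide by elapsed time."""
    rates = [0.0]  # no previous point for the first entry (assumed 0 here)
    for i in range(1, len(values)):
        rates.append((values[i] - values[i - 1]) / (seconds[i] - seconds[i - 1]))
    return rates


# Hypothetical speeds (m/s) at timestamps given in seconds since the start.
seconds = [0, 10, 20, 30]
speeds = [0.0, 5.0, 15.0, 15.0]

acceleration = rate_of_change(speeds, seconds)  # m/s^2
jerk = rate_of_change(acceleration, seconds)    # m/s^3
```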

static create_point_within_range_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame, coordinates: tuple, dist_range: float)[source]

Check which points are within the range of the given coordinate. The function first creates a column containing the distance between the given coordinate and the rest of the points in the dataframe by calling create_distance_from_point_column(). It then checks whether each point is within the range, stores the results in a new column and attaches that column to the dataframe.

Note

The dist_range parameter is given in metres.

Parameters
  • dataframe (PTRAILDataFrame) – The dataframe on which the point within range calculation is to be done.

  • coordinates (tuple) – The coordinates from which the distance is to be calculated.

  • dist_range (float) – The range within which the resultant distance from the coordinates should lie.

Returns

The dataframe containing the resultant Within_x_m_from_(x,y) column.

Return type

PTRAILDataFrame
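The two-step procedure (distance from the given coordinate, then a comparison against dist_range) can be sketched as follows. `haversine_m`, `origin` and `points` are hypothetical, and treating the boundary as inclusive is an assumption.

```python
import math


def haversine_m(p, q, radius_m=6_371_000):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


origin = (47.0, -52.0)   # hypothetical reference coordinate (lat, lon)
dist_range = 1_000       # metres
points = [(47.001, -52.0), (47.1, -52.0)]

# Step 1: distance of every point from the reference coordinate.
distances = [haversine_m(origin, p) for p in points]
# Step 2: flag the points whose distance lies within the range.
within = [d <= dist_range for d in distances]
```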

static create_rate_of_br_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]

Calculate the rate of bearing rate between consecutive points and add that column to the dataframe.

Note

The rate of bearing rate calculated here is the rate of bearing rate between 2 consecutive points of the same trajectory. Furthermore, the rate of bearing rate yielded is in degrees/second^2.

Parameters

dataframe (PTRAILDataFrame) – The dataframe on which the rate of bearing rate is to be calculated

Returns

The dataframe containing the resultant Rate_of_bearing_rate_from_prev column

Return type

PTRAILDataFrame

static create_speed_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]

Create a column containing speed of the object from the previous point to the current point.

Note

When the trajectory ID changes in the data, then the speed calculation again starts from the first point of the new trajectory ID and the speed of the first point of the new trajectory ID will be set to 0.

Note

The Speed calculated here is the speed between 2 consecutive points of the same trajectory. Furthermore, the speed yielded is in metres/second (m/s).

Parameters

dataframe (PTRAILDataFrame) – The dataframe on which the calculation of speed is to be done.

Returns

The dataframe containing the resultant Speed_prev_to_curr column.

Return type

PTRAILDataFrame
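Speed between 2 consecutive points is the distance between them divided by the time elapsed. A minimal sketch with hypothetical values, following the note that the first point of a trajectory gets a speed of 0:

```python
from datetime import datetime

# Hypothetical per-point data for one trajectory: distance from the previous
# point in metres (0 for the first point) and the timestamp of each point.
distances_m = [0.0, 50.0, 120.0]
times = [
    datetime(2021, 1, 1, 12, 0, 0),
    datetime(2021, 1, 1, 12, 0, 10),
    datetime(2021, 1, 1, 12, 0, 30),
]

speeds = [0.0]  # the first point of a trajectory has no previous point
for i in range(1, len(times)):
    elapsed = (times[i] - times[i - 1]).total_seconds()
    speeds.append(distances_m[i] / elapsed)  # metres per second
```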

static distance_travelled_by_date_and_traj_id(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame, date, traj_id)[source]

Given a date and trajectory ID, calculate the total distance covered in the trajectory on that particular date.

Note

The distance yielded is in metres (m).

Parameters
  • dataframe (PTRAILDataFrame) – The dataframe in which the actual data is stored.

  • date (Text) – The Date on which the distance covered is to be calculated.

  • traj_id (Text) – The trajectory ID for which the distance covered is to be calculated.

Returns

The total distance covered on that date by that trajectory ID.

Return type

float

Raises

KeyError: – Traj_id is not present in the arguments passed.
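One way to read this function is: filter the points by trajectory ID and date, then sum the distances between consecutive points. The sketch below follows that reading with hypothetical data; `distance_on`, `haversine_m` and the `rows` layout are assumptions, and the library may aggregate differently.

```python
import math
from datetime import date, datetime


def haversine_m(p, q, radius_m=6_371_000):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


# Hypothetical rows as (traj_id, timestamp, lat, lon), ordered by time.
rows = [
    ("a", datetime(2021, 1, 1, 8, 0), 0.0, 0.0),
    ("a", datetime(2021, 1, 1, 9, 0), 0.0, 1.0),
    ("a", datetime(2021, 1, 2, 8, 0), 0.0, 2.0),
    ("b", datetime(2021, 1, 1, 8, 0), 5.0, 5.0),
]


def distance_on(rows, day, traj_id):
    """Total distance in metres covered by traj_id on the given day."""
    if traj_id not in {r[0] for r in rows}:
        raise KeyError(traj_id)
    pts = [(lat, lon) for tid, ts, lat, lon in rows
           if tid == traj_id and ts.date() == day]
    # Sum the distances between each pair of consecutive points.
    return sum(haversine_m(p, q) for p, q in zip(pts, pts[1:]))


total = distance_on(rows, date(2021, 1, 1), "a")
```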

static generate_kinematic_features(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]

Generate all the Kinematic features with a single call of this function.

Parameters

dataframe (PTRAILDataFrame) – The dataframe on which the features are to be generated.

Returns

The dataframe enriched with Kinematic Features.

Return type

PTRAILDataFrame

static get_bounding_box(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]

Return the bounding box of the Trajectory data. Essentially, the bounding box is of the following format:

(min Latitude, min Longitude, max Latitude, max Longitude).

Parameters

dataframe (PTRAILDataFrame) – The dataframe containing the trajectory data.

Returns

The bounding box of the trajectory

Return type

tuple
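The bounding box format above can be reproduced from a list of coordinates in a few lines. The `points` list is hypothetical and holds (lat, lon) tuples.

```python
# Hypothetical trajectory points as (lat, lon).
points = [(47.5, -52.7), (47.6, -52.9), (47.4, -52.8)]

lats = [lat for lat, _ in points]
lons = [lon for _, lon in points]

# (min Latitude, min Longitude, max Latitude, max Longitude)
bounding_box = (min(lats), min(lons), max(lats), max(lons))
```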

static get_distance_travelled_by_traj_id(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame, traj_id: str)[source]

Given a trajectory ID, calculate the total distance covered by the trajectory. NOTE: The distance calculated is in metres.

Parameters
  • dataframe (PTRAILDataFrame) – The dataframe containing the entire dataset.

  • traj_id (Text) – The trajectory ID for which the distance covered is to be calculated.

Returns

The distance covered by the trajectory

Return type

float

Raises

MissingTrajIDException: – The Trajectory ID given by the user is not present in the dataset.

static get_end_location(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame, traj_id: Optional[str] = None)[source]

Get the ending location of an object’s trajectory in the data.

Note

If the user does not pass a traj_id, the library by default returns the end locations of all the unique trajectory IDs present in the data.

Parameters
  • dataframe (PTRAILDataFrame) – The PTRAILDataFrame storing the trajectory data.

  • traj_id – The ID of the trajectory whose end location is to be found.

Returns

  • tuple – The (lat, longitude) tuple containing the end location.

  • pandas.core.frame.DataFrame – The dataframe containing end locations of all trajectory IDs.

static get_number_of_locations(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame, traj_id: Optional[str] = None)[source]

Get the number of unique coordinates in the dataframe specific to a trajectory ID.

Note

If no Trajectory ID is specified, then the number of unique locations visited by each trajectory in the dataset is calculated.

Parameters
  • dataframe (PTRAILDataFrame) – The dataframe of which the number of locations are to be computed

  • traj_id (Text) – The trajectory id for which the number of unique locations are to be found

Returns

  • int – The number of unique locations in the dataframe/trajectory id.

  • pandas.core.frame.DataFrame – The dataframe containing the number of unique locations of all trajectory IDs.
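Counting unique coordinates per trajectory amounts to collecting (lat, lon) pairs into a set for each trajectory ID. A minimal sketch with hypothetical rows:

```python
# Hypothetical rows as (traj_id, lat, lon); trajectory "a" revisits one location.
rows = [("a", 1.0, 2.0), ("a", 1.0, 2.0), ("a", 1.5, 2.0), ("b", 3.0, 4.0)]

unique = {}
for traj_id, lat, lon in rows:
    # A set keeps each (lat, lon) pair only once per trajectory.
    unique.setdefault(traj_id, set()).add((lat, lon))

counts = {traj_id: len(locations) for traj_id, locations in unique.items()}
```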

static get_start_location(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame, traj_id=None)[source]

Get the starting location of an object’s trajectory in the data.

Note

If the user does not pass a traj_id, the library by default returns the start locations of all the unique trajectory IDs present in the data.

Parameters
  • dataframe (PTRAILDataFrame) – The PTRAILDataFrame storing the trajectory data.

  • traj_id – The ID of the object whose start location is to be found.

Returns

  • tuple – The (lat, longitude) tuple containing the start location.

  • pandas.core.frame.DataFrame – The dataframe containing start locations of all trajectory IDs.
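Start and end locations can be read off once the points are ordered by time. The sketch below assumes the start location is the point with the earliest timestamp and the end location is the point with the latest one; the `rows` layout is hypothetical.

```python
from datetime import datetime

# Hypothetical rows as (traj_id, timestamp, lat, lon), in no particular order.
rows = [
    ("a", datetime(2021, 1, 1, 9, 0), 47.2, -52.2),
    ("a", datetime(2021, 1, 1, 8, 0), 47.1, -52.1),
    ("b", datetime(2021, 1, 1, 8, 30), 48.0, -53.0),
    ("a", datetime(2021, 1, 1, 10, 0), 47.3, -52.3),
]

start, end = {}, {}
for traj_id, _, lat, lon in sorted(rows, key=lambda r: r[1]):
    start.setdefault(traj_id, (lat, lon))  # keep the earliest point only
    end[traj_id] = (lat, lon)              # overwritten until the latest point
```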

ptrail.features.temporal_features module

1. The temporal_features module contains the functions of the library that calculate features based on the DateTime provided in the data.
2. Most of the functions in this module calculate a feature and then add the results to an entirely new column with a new column header.
3. Many of these features are inspired by the PyMove library, and its creators are credited for them.

Authors: Yaksh J Haranwala, Salman Haidri

References

Arina De Jesus Amador Monteiro Sanches. “Uma Arquitetura E Implementação Do Módulo De Pré-processamento Para Biblioteca Pymove”. Bachelor’s thesis. Universidade Federal Do Ceará, 2019.

class ptrail.features.temporal_features.TemporalFeatures[source]

Bases: object

static create_date_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]

From the DateTime column already present in the data, extract only the date and then add another column containing just the date.

Parameters

dataframe (PTRAILDataFrame) – The PTRAILDataFrame on which the creation of the date column is to be done.

Returns

The dataframe containing the resultant Date column.

Return type

PTRAILDataFrame

static create_day_of_week_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]

Create a column called Day_Of_Week which contains the day of the week on which the trajectory point is recorded. This is calculated on the basis of timestamp recorded in the data.

Parameters

dataframe (PTRAILDataFrame) – The dataframe containing the entire data on which the operation is to be performed

Returns

The dataframe containing the resultant Day_of_week column.

Return type

PTRAILDataFrame

static create_time_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]

From the DateTime column already present in the data, extract only the time and then add another column containing just the time.

Parameters

dataframe (PTRAILDataFrame) – The PTRAILDataFrame on which the creation of the time column is to be done.

Returns

The dataframe containing the resultant Time column.

Return type

PTRAILDataFrame

static create_time_of_day_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]

Create a Time_Of_Day column in the dataframe, using parallelization, which indicates at what time of the day the point data was captured. Note: The divisions of the day based on the time are provided in the utilities.constants module.

Parameters

dataframe (PTRAILDataFrame) – The dataframe on which the calculation is to be done.

Returns

The dataframe containing the resultant Time_Of_Day column.

Return type

PTRAILDataFrame

References

Arina De Jesus Amador Monteiro Sanches. ‘Uma Arquitetura E Implementação Do Módulo De Pré-processamento Para Biblioteca Pymove’. Bachelor’s thesis. Universidade Federal Do Ceará, 2019.
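Mapping a timestamp to a time-of-day label is a simple lookup against hour boundaries. The boundaries and labels below are purely hypothetical placeholders; the divisions the library actually uses are defined in its utilities.constants module and may differ.

```python
from datetime import datetime

# Hypothetical divisions as (start_hour, label), in ascending order.
DIVISIONS = [(0, "Night"), (6, "Morning"), (12, "Afternoon"), (18, "Evening")]


def time_of_day(ts):
    """Return the label of the last division that starts at or before ts.hour."""
    label = DIVISIONS[0][1]
    for start_hour, name in DIVISIONS:
        if ts.hour >= start_hour:
            label = name
    return label


labels = [time_of_day(datetime(2021, 1, 1, hour)) for hour in (3, 9, 15, 21)]
```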

static create_weekend_indicator_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]

Create a column called Weekend which indicates whether the point data is collected on either a Saturday or a Sunday.

Parameters

dataframe (PTRAILDataFrame) – The dataframe on which the operation is to be performed.

Returns

The dataframe containing the resultant Weekend column.

Return type

PTRAILDataFrame

References

Arina De Jesus Amador Monteiro Sanches. ‘Uma Arquitetura E Implementação Do Módulo De Pré-processamento Para Biblioteca Pymove’. Bachelor’s thesis. Universidade Federal Do Ceará, 2019.
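Both the day of the week and the weekend indicator can be derived from a timestamp's weekday number. In this sketch the representation of the results (day names as strings, the weekend flag as a boolean) is an assumption, and `DAY_NAMES` is a hypothetical helper list.

```python
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical timestamps: 1 January 2021 was a Friday.
timestamps = [datetime(2021, 1, 1, 9, 0), datetime(2021, 1, 2, 9, 0)]

# weekday() returns 0 for Monday through 6 for Sunday.
day_of_week = [DAY_NAMES[ts.weekday()] for ts in timestamps]
weekend = [ts.weekday() >= 5 for ts in timestamps]
```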

static generate_temporal_features(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]

Generate all the temporal features with a single call of this function.

Parameters

dataframe (PTRAILDataFrame) – The dataframe on which the features are to be generated.

Returns

The dataframe enriched with Temporal Features.

Return type

PTRAILDataFrame

static get_end_time(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame, traj_id: Optional[str] = None)[source]

Get the ending time of the trajectory.

Note

If the trajectory ID is not specified by the user, then by default, the ending times of all the trajectory IDs in the data are returned.

Parameters
  • dataframe (PTRAILDataFrame) – The dataframe on which the operations are to be performed.

  • traj_id (Optional[Text]) – The trajectory for which the end time is required.

Returns

  • pandas.Timestamp – The end time of a single trajectory.

  • pandas.core.frame.DataFrame – Pandas dataframe containing the end times of all the trajectories present in the data when the user hasn’t asked for a particular trajectory’s end time.

static get_start_time(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame, traj_id: Optional[str] = None)[source]

Get the starting time of the trajectory.

Note

If the trajectory ID is not specified by the user, then by default, the starting times of all the trajectory IDs in the data are returned.

Parameters
  • dataframe (PTRAILDataFrame) – The dataframe on which the operations are to be performed.

  • traj_id (Optional[Text]) – The trajectory for which the start time is required.

Returns

  • pandas.Timestamp – The start time of a single trajectory.

  • pandas.core.frame.DataFrame – Pandas dataframe containing the start times of all the trajectories present in the data when the user hasn’t asked for a particular trajectory’s start time.

static get_traj_duration(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame, traj_id: Optional[str] = None)[source]

Accessor method for the duration of a trajectory specified by the user.

Note

If no trajectory ID is given by the user, then the duration of each unique trajectory is calculated.

Parameters
  • dataframe (PTRAILDataFrame) – The dataframe containing the trajectory data.

  • traj_id (Optional[Text]) – The trajectory id for which the duration is required.

Returns

  • pandas.Timedelta – The trajectory duration.

  • pandas.core.frame.DataFrame – The dataframe containing the duration of all trajectories in the dataset.
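The duration of a trajectory is its latest timestamp minus its earliest one. A minimal sketch with hypothetical rows, which computes the duration of every trajectory as in the case where no trajectory ID is given:

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical rows as (traj_id, timestamp).
rows = [
    ("a", datetime(2021, 1, 1, 8, 0)),
    ("a", datetime(2021, 1, 1, 9, 30)),
    ("b", datetime(2021, 1, 2, 10, 0)),
    ("b", datetime(2021, 1, 2, 10, 45)),
]

# Group the timestamps by trajectory ID.
times_by_traj = defaultdict(list)
for traj_id, ts in rows:
    times_by_traj[traj_id].append(ts)

# Duration = max time - min time, giving a datetime.timedelta per trajectory.
durations = {tid: max(ts) - min(ts) for tid, ts in times_by_traj.items()}
```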

Module contents