ptrail.features package
Submodules
ptrail.features.contextual_features module
The contextual features module contains several semantic features such as intersection of trajectories and stop and stay point detection. It also calculates features such as the distance from Points of Interest and water bodies, along with other demographic features related to the trajectory data. The demographic features are extracted with the help of the Python osmnx library.
- class ptrail.features.contextual_features.ContextualFeatures[source]
Bases:
object
- static nearest_poi(coords: tuple, dist_threshold, tags: dict)[source]
Given a coordinate point and a distance threshold, find the Point of Interest which is the nearest to it within the given distance threshold.
Warning
Users are advised to be mindful of the tags passed in as the parameter. The more tags there are, the longer the OSMnx library takes to download the information from OpenStreetMap. An active internet connection is also required to execute this function.
Note
If several tags (POIs) are passed in, the method finds the closest one based on distance and returns only that one. Other POIs that may be present within the threshold of the given point are not returned.
- Parameters
coords (tuple) – The point near which the Point of Interest is to be found.
dist_threshold – The maximum distance from the point within which the Point of Interest is to be searched for.
tags (dict) – The dictionary containing tags of Points of interest.
- Returns
A pandas DataFrame containing the info about the nearest Point of Interest from the given point.
- Return type
pandas.core.dataframe.DataFrame
- Raises
JSONDecodeError: – One or more given tags are invalid.
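The selection rule described above (keep only the closest candidate inside the threshold) can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the real method downloads candidates through osmnx, whereas here `pois` is a hypothetical in-memory list, and `haversine_m` and `nearest_within` are illustrative names.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6371000.0):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


def nearest_within(coords, candidates, dist_threshold):
    """Return (name, distance_m) of the closest candidate within the threshold, else None."""
    best = None
    for name, lat, lon in candidates:
        d = haversine_m(coords[0], coords[1], lat, lon)
        if d <= dist_threshold and (best is None or d < best[1]):
            best = (name, d)
    return best


# Hypothetical candidates as (name, latitude, longitude).
pois = [("cafe", 40.0010, -105.0000), ("bank", 40.0100, -105.0000)]
result = nearest_within((40.0, -105.0), pois, dist_threshold=500)
```

Only the cafe lies within 500 m of the query point, so it is the single candidate returned; the bank is ignored even though it matches a tag.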
- static traj_intersect_inside_polygon(df1: ptrail.core.TrajectoryDF.PTRAILDataFrame, df2: ptrail.core.TrajectoryDF.PTRAILDataFrame, polygon: shapely.geometry.Polygon)[source]
Given df1 and df2 containing trajectory data along with a polygon, check whether the trajectories are inside the polygon and, if they are, whether they intersect at any point.
Warning
While creating a polygon, the format of the coordinates is: (longitude, latitude) instead of (latitude, longitude). Beware of that, otherwise the results will be incorrect.
Note
Note that df1 and df2 should each contain the data of only one trajectory. If either contains more than one trajectory, the results might be unexpected.
- Parameters
df1 (PTRAILDataFrame) – Trajectory Dataframe 1.
df2 (PTRAILDataFrame) – Trajectory Dataframe 2.
polygon (Polygon) – The area inside which it is to be determined if the trajectories intersect or not.
- Returns
PTRAILDataFrame – A dataframe containing trajectories that are inside the polygon.
geopandas.GeoDataFrame – An empty dataframe if the trajectories do not intersect.
- static trajectories_inside_polygon(df: ptrail.core.TrajectoryDF.PTRAILDataFrame, polygon: shapely.geometry.Polygon)[source]
Given a trajectory dataframe and a Polygon, find out all the trajectories that are inside the given polygon.
Warning
While creating a polygon, the format of the coordinates is: (longitude, latitude) instead of (latitude, longitude). Beware of that, otherwise the results will be incorrect.
- Parameters
df (PTRAILDataFrame) – The dataframe containing the trajectory data.
polygon (Polygon) – The polygon inside which the points are to be found.
- Returns
A dataframe containing trajectories that are inside the polygon.
- Return type
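The coordinate-order warning above is easy to trip over, so here is a small sketch of what goes wrong when latitude and longitude are swapped. It uses a plain-Python ray-casting test rather than shapely, and `point_in_polygon` is an illustrative helper, not part of PTRAIL.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting containment test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Rectangle spanning longitudes -106..-104 and latitudes 39..41, in (lon, lat) order.
area = [(-106.0, 39.0), (-104.0, 39.0), (-104.0, 41.0), (-106.0, 41.0)]
point_lat, point_lon = 40.0, -105.0

correct = point_in_polygon(point_lon, point_lat, area)
swapped = point_in_polygon(point_lat, point_lon, area)  # arguments in the wrong order
```

With the arguments in (longitude, latitude) order the point is found inside the rectangle; with them swapped, the same point is silently reported as outside.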
- static visited_location(df: ptrail.core.TrajectoryDF.PTRAILDataFrame, geo_layers: Union[pandas.DataFrame, geopandas.GeoDataFrame], visited_location_name: str, location_column_name: str)[source]
Create a column called visited_Location for all the pastures present in the dataset.
Warning
When using this method, make sure that the geo_layers dataframe passed in has its latitude and longitude columns named ‘lat’ and ‘lon’ respectively. If this format is not followed, a KeyError will be thrown.
Note
Note that this function takes longer to execute if either the dataset or the surrounding data passed in is very large. It has been parallelized to make it faster, but it can still take a long time depending on the size of the data being analyzed.
- Parameters
df (PTRAILDataFrame) – The dataframe containing the dataset.
geo_layers (Union[pd.DataFrame, gpd.GeoDataFrame]) – The Dataframe containing the geographical layers near the trajectory data.
visited_location_name (Text) – The location for which it is to be checked whether the object visited it or not.
location_column_name (Text) – The name of the column that contains the location to be checked.
- Returns
The Dataframe containing a new column indicating whether the animal has visited the pasture or not.
- Return type
- Raises
KeyError: – The column or the location name does not exist.
- static visited_poi(df: ptrail.core.TrajectoryDF.PTRAILDataFrame, surrounding_data: Union[geopandas.GeoDataFrame, pandas.DataFrame, ptrail.core.TrajectoryDF.PTRAILDataFrame], dist_column_label: str, nearby_threshold: int)[source]
Given surrounding data with information about the distance to the nearest POI source from a given coordinate, check whether the objects in the given trajectory data have visited/crossed those POIs or not.
Warning
For this method to work, the surrounding dataset NEEDS to have a column containing the distance to the nearest POI. For more info, see the Starkey habitat dataset, which has columns like ‘DistCWat’ and ‘DistEWat’.
- Parameters
df (PTRAILDataFrame) – The dataframe containing the trajectory data.
surrounding_data (Union[gpd.GeoDataFrame, pd.DataFrame]) – The surrounding data that needs to contain the information of distance to the nearest water body.
dist_column_label (Text) – The name of the column containing the distance information.
nearby_threshold (int) – The maximum distance between the POI and the current location of the object within which the object is considered to be crossing/visiting the POI.
- Returns
The dataframe containing the new column indicating whether the object at that point is near.
- Return type
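The role of nearby_threshold can be sketched as a simple comparison against the distance column. This is a hedged illustration only: the output key `Nearby_POI`, the helper `flag_nearby`, and the choice of `<=` over `<` are assumptions, and the way PTRAIL matches trajectory points to the surrounding data is not shown here.

```python
def flag_nearby(rows, dist_column_label, nearby_threshold):
    """Return copies of rows with a boolean 'Nearby_POI' key (hypothetical name)."""
    return [
        {**row, "Nearby_POI": row[dist_column_label] <= nearby_threshold}
        for row in rows
    ]


# Hypothetical rows that already carry a distance-to-nearest-POI column.
rows = [
    {"traj_id": "a", "DistCWat": 12.0},
    {"traj_id": "a", "DistCWat": 250.0},
]
flagged = flag_nearby(rows, "DistCWat", nearby_threshold=50)
```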
ptrail.features.helper_functions module
This module contains all the helper functions for the parallel calculations in the spatial and temporal features classes.
Warning
These functions should not be used directly, as doing so results in slower calculations. In some cases, these functions might even yield wrong results if used directly. They are meant to be used only as helpers. To calculate features, use the functions in the features package.
- class ptrail.features.helper_functions.Helpers[source]
Bases:
object
- static bearing_helper(dataframe)[source]
This function is the helper function of create_bearing_column(). The create_bearing_column() function delegates the task of calculating the bearing between 2 points to this function because the original function runs multiple instances of this function in parallel. This function calculates the bearing between 2 consecutive points in the entire DF, creates a column in the dataframe and returns it.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe on which the calculation is to be done.
- Returns
The dataframe containing the Bearing column.
- Return type
- static distance_between_consecutive_helper(dataframe)[source]
This function is the helper function of the create_distance_between_consecutive_column() function. The create_distance_between_consecutive_column() function delegates the actual task of calculating the distance between 2 consecutive points to this function. This function does the calculation, creates a column called Distance_prev_to_curr, places it in the dataframe and returns it.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe on which calculation is to be performed.
- Returns
The dataframe containing the resultant Distance_prev_to_curr column.
- Return type
References
Arina De Jesus Amador Monteiro Sanches. “Uma Arquitetura E Implementação Do Módulo De Pré-processamento Para Biblioteca Pymove”. Bachelor’s thesis. Universidade Federal Do Ceará, 2019.
- static distance_from_given_point_helper(dataframe, coordinates)[source]
This function is the helper function of the create_distance_from_point_column() function. The create_distance_from_point_column() function delegates the actual task of calculating the distance from the given point to all the points in the dataframe to this function. This function calculates the distance and creates another column called ‘Distance_to_specified_point’.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe on which calculation is to be done.
coordinates (tuple) – The coordinates from which the distance is to be calculated.
- Returns
The dataframe containing the resultant Distance_from_(x, y) column.
- Return type
pandas.core.dataframe.DataFrame
- static distance_from_start_helper(dataframe)[source]
This function is the helper function of the create_distance_from_start_column() function. The create_distance_from_start_column() function delegates the actual task of calculating the distance between the start point of the trajectory and the current point to this function. This function does the calculation, creates a column called Distance_start_to_curr, places it in the dataframe and returns it.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe on which calculation is to be performed.
- Returns
The dataframe containing the resultant Distance_start_to_curr column.
- Return type
pandas.core.dataframe.DataFrame
- static end_location_helper(dataframe, ids_)[source]
This function is the helper function of get_end_location(). The get_end_location() function delegates the task of calculating the end location of the trajectories in the dataframe to this function because the original function runs multiple instances of this function in parallel. This function finds the end location of the specified trajectory IDs in the DF and then returns another dataframe containing the end latitude, end longitude and trajectory ID for each trajectory.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe of which the locations are to be found.
ids_ (list) – List of trajectory ids for which the end locations are to be calculated.
- Returns
New dataframe containing Trajectory ID as index and latitude and longitude as other 2 columns.
- Return type
pandas.core.dataframe.DataFrame
- static end_time_helper(dataframe, ids_)[source]
This function is the helper function of get_end_time(). The get_end_time() function delegates the task of calculating the end time of the trajectories in the dataframe to this function because the original function runs multiple instances of this function in parallel. This function finds the end time of the specified trajectory IDs in the DF and then returns another dataframe containing the end latitude, end longitude, DateTime and trajectory ID for each trajectory.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe containing the original data.
ids_ (list) – List of trajectory ids for which the end times are to be calculated.
- Returns
New dataframe containing Trajectory ID as index and the end time of all trajectories.
- Return type
pandas.core.dataframe.DataFrame
- static number_of_location_helper(dataframe, ids_)[source]
This is the helper function for the get_number_of_locations() function. The get_number_of_locations() function delegates the actual task of calculating the number of unique locations visited by a particular object to this function. This function calculates the number of unique locations visited by each unique object and returns a dataframe containing the results.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe containing all the original data.
ids_ (list) – The list of ids for which the number of unique locations visited is to be calculated.
- Returns
dataframe containing the results.
- Return type
pandas.core.dataframe.DataFrame
- static point_within_range_helper(dataframe, coordinates, dist_range)[source]
This is the helper function for the create_point_within_range_column() function.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe on which the operation is to be performed.
coordinates (tuple) – The coordinates from which the distance is to be checked.
dist_range – The range within which the distance from the coordinates should lie.
- Returns
The dataframe containing the resultant Within_X_m_from_(x,y) column.
- Return type
pandas.core.dataframe.DataFrame
- static start_location_helper(dataframe, ids_)[source]
This function is the helper function of get_start_location(). The get_start_location() function delegates the task of calculating the start location of the trajectories in the dataframe to this function because the original function runs multiple instances of this function in parallel. This function finds the start location of the specified trajectory IDs in the DF and then returns another dataframe containing the start latitude, start longitude and trajectory ID for each trajectory.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe of which the locations are to be found.
ids_ (list) – List of trajectory ids for which the start locations are to be calculated.
- Returns
New dataframe containing Trajectory ID as index and latitude and longitude as the other 2 columns.
- Return type
pandas.core.dataframe.DataFrame
- static start_time_helper(dataframe, ids_)[source]
This function is the helper function of get_start_time(). The get_start_time() function delegates the task of calculating the start time of the trajectories in the dataframe to this function because the original function runs multiple instances of this function in parallel. This function finds the start time of the specified trajectory IDs in the DF and then returns another dataframe containing the start latitude, start longitude, DateTime and trajectory ID for each trajectory.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe containing the original data.
ids_ (list) – List of trajectory ids for which the start times are to be calculated.
- Returns
New dataframe containing Trajectory ID as index and start time of all trajectories.
- Return type
pandas.core.dataframe.DataFrame
- static traj_duration_helper(dataframe, ids_)[source]
Calculate the duration of the trajectory, i.e. subtract the min time of the trajectory from the max time of the trajectory.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe containing all the original data.
ids_ (list) – A list containing all the Trajectory IDs present in the dataset.
- Returns
The resultant dataframe containing all the trajectory durations.
- Return type
pandas.core.dataframe.DataFrame
- static visited_poi_helper(df, surrounding_data, dist_column_label, nearby_threshold)[source]
Given a Trajectory dataframe and another dataset with the surrounding data, find whether the given object is nearby a point of interest or not.
- Parameters
df – The dataframe containing the trajectory data.
surrounding_data – The dataframe containing the data of the surroundings.
dist_column_label (Text) – The label of the column containing the distance of the coords from the nearest POI.
nearby_threshold (int) – The maximum distance between the POI and the current location of the object within which the object is considered to be crossing/visiting the POI.
- Returns
The original dataframe with another column added to it indicating whether each point is within the nearby threshold of a POI.
ptrail.features.kinematic_features module
The kinematic_features module contains several functions of the library that calculate kinematic features based on the coordinates of points provided in the data. This module mostly extracts and modifies data collected from an existing dataframe and appends this information to it. Many functions in this module are inspired by the PyMove library.
References
Arina De Jesus Amador Monteiro Sanches. “Uma Arquitetura E Implementação Do Módulo De Pré-processamento Para Biblioteca Pymove”. Bachelor’s thesis. Universidade Federal Do Ceará, 2019.
- class ptrail.features.kinematic_features.KinematicFeatures[source]
Bases:
object
- static create_acceleration_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]
Create a column containing acceleration of the object from the previous to the current point.
Note
The acceleration calculated here is the acceleration between 2 consecutive points of the same trajectory. Furthermore, the acceleration yielded is in metres/second^2 (m/s^2).
- Parameters
dataframe (PTRAILDataFrame) – The dataframe on which the calculation of acceleration is to be done.
- Returns
The dataframe containing the resultant Acceleration_prev_to_curr column.
- Return type
- static create_bearing_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]
Create a column containing the bearing between 2 consecutive points. Bearing is sometimes also referred to as “Forward Azimuth”. Bearing/Forward Azimuth is defined as follows:
Bearing is the horizontal angle between the direction of an object and another object, or between the object and the True North.
Note
The bearing calculated here is the bearing between 2 consecutive points of the same trajectory. Furthermore, the bearing yielded is in degrees.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe on which the bearing is to be calculated.
- Returns
The dataframe containing the resultant Bearing_from_prev column.
- Return type
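For reference, the standard forward-azimuth formula between two points can be sketched as below. This is a generic illustration and not necessarily how PTRAIL computes the column; in particular, normalising the result to the range [0, 360) is an assumption, and `forward_azimuth` is an illustrative name.

```python
import math


def forward_azimuth(lat1, lon1, lat2, lon2):
    """Initial bearing in degrees from point 1 to point 2, measured clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb)
    # atan2 yields (-180, 180]; shift into [0, 360).
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0


north = forward_azimuth(0.0, 0.0, 1.0, 0.0)  # moving due north
east = forward_azimuth(0.0, 0.0, 0.0, 1.0)   # moving due east along the equator
south = forward_azimuth(1.0, 0.0, 0.0, 0.0)  # moving due south
```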
- static create_bearing_rate_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]
Calculate the bearing rate between consecutive points and add it as a column to the dataframe.
Note
The bearing rate calculated here is the bearing rate between 2 consecutive points of the same trajectory. Furthermore, the bearing rate yielded is in degrees/second.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe on which the bearing rate is to be calculated
- Returns
The dataframe containing the resultant Bearing_rate_from_prev column.
- Return type
- static create_distance_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]
Create a column called Distance_prev_to_curr containing the distance between 2 consecutive points. The distance calculated is the Great-Circle (Haversine) distance.
Note
When the trajectory ID changes in the data, then the distance calculation again starts from the first point of the new trajectory ID and the distance-value of the first point of the new Trajectory ID will be set to 0.
Note
The Distance calculated here is the distance between 2 consecutive points of the same trajectory. Furthermore, the distance yielded is in metres (m).
- Parameters
dataframe (PTRAILDataFrame) – The data where distance is to be calculated.
- Returns
The dataframe containing the resultant Distance_prev_to_curr column.
- Return type
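The two notes above (Haversine distance in metres, reset to 0 whenever the trajectory ID changes) can be sketched as follows. This is a plain-Python illustration under stated assumptions: the points are already sorted by trajectory and time, the Earth radius is taken as 6,371,000 m, and `distance_prev_to_curr` is an illustrative name rather than a PTRAIL function.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6371000.0):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


def distance_prev_to_curr(points):
    """points: list of (traj_id, lat, lon), sorted by trajectory and then by time."""
    distances = []
    prev = None
    for traj_id, lat, lon in points:
        if prev is None or prev[0] != traj_id:
            distances.append(0.0)  # first point of a trajectory
        else:
            distances.append(haversine_m(prev[1], prev[2], lat, lon))
        prev = (traj_id, lat, lon)
    return distances


points = [("a", 0.0, 0.0), ("a", 0.0, 1.0), ("b", 10.0, 10.0), ("b", 11.0, 10.0)]
d = distance_prev_to_curr(points)
```

One degree along the equator or along a meridian is roughly 111.2 km, so the second and fourth entries are close to 111,195 m while the first point of each trajectory is 0.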
- static create_distance_from_point_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame, coordinates: tuple)[source]
Given a point, this function calculates the distance between that point and all the points present in the dataframe and adds that column into the dataframe.
Note
The distance yielded here is in metres.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe on which calculation is to be done.
coordinates (tuple) – The coordinates from which the distance is to be calculated.
- Returns
The dataframe containing the resultant Distance_from_(x, y) column.
- Return type
- static create_distance_from_start_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]
Create a column containing distance between the start location and the rest of the points using Haversine formula. The distance calculated is the Great-Circle distance.
Note
When the trajectory ID changes in the data, then the distance calculation again starts from the first point of the new trajectory ID and the first distance of the new trajectory ID will be set to 0.
Note
The Distance calculated here is the distance between the start point and the current points of the same trajectory. Furthermore, the distance yielded is in metres (m).
- Parameters
dataframe (PTRAILDataFrame) – The data where distance is to be calculated.
- Returns
The dataframe containing the resultant Distance_start_to_curr column.
- Return type
- static create_jerk_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]
Create a column containing jerk of the object from previous to the current point.
Note
The jerk calculated here is the jerk between 2 consecutive points of the same trajectory. Furthermore, the jerk yielded is in metres/second^3 (m/s^3).
- Parameters
dataframe (PTRAILDataFrame) – The dataframe on which the calculation of jerk is to be done.
- Returns
The dataframe containing the resultant jerk_prev_to_curr column.
- Return type
- static create_point_within_range_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame, coordinates: tuple, dist_range: float)[source]
Check which points are within the range of the given coordinate. The function first creates a column containing the distance between the given coordinate and the rest of the points in the dataframe by calling create_distance_from_point_column(). It then checks whether each distance is within the range and attaches the results to the dataframe as a new column.
Note
The dist_range parameter is given in metres.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe on which the point within range calculation is to be done.
coordinates (tuple) – The coordinates from which the distance is to be calculated.
dist_range (float) – The range within which the resultant distance from the coordinates should lie.
- Returns
The dataframe containing the resultant Within_x_m_from_(x,y) column.
- Return type
- static create_rate_of_br_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]
Calculate the rate of bearing rate between consecutive points and add it as a column to the dataframe.
Note
The rate of bearing rate calculated here is the rate of bearing rate between 2 consecutive points of the same trajectory. Furthermore, the rate of bearing rate yielded is in degrees/second^2.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe on which the rate of bearing rate is to be calculated
- Returns
The dataframe containing the resultant Rate_of_bearing_rate_from_prev column
- Return type
- static create_speed_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]
Create a column containing speed of the object from the previous point to the current point.
Note
When the trajectory ID changes in the data, then the speed calculation again starts from the first point of the new trajectory ID and the speed of the first point of the new trajectory ID will be set to 0.
Note
The Speed calculated here is the speed between 2 consecutive points of the same trajectory. Furthermore, the speed yielded is in metres/second (m/s).
- Parameters
dataframe (PTRAILDataFrame) – The dataframe on which the calculation of speed is to be done.
- Returns
The dataframe containing the resultant Speed_prev_to_curr column.
- Return type
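Speed, acceleration and jerk form a chain of successive rates of change, which the sketch below makes explicit for a single trajectory. It is an illustration under assumptions, not PTRAIL's code: timestamps are strictly increasing, distances are already in metres, and the first entry of every derived list is set to 0 (the documentation states this for speed; for acceleration and jerk it is assumed here).

```python
from datetime import datetime


def kinematics(distances_m, timestamps):
    """Return (speed, acceleration, jerk) lists for one trajectory."""
    n = len(timestamps)
    # Seconds elapsed since the previous point; index 0 is a placeholder.
    dt = [0.0] + [(timestamps[i] - timestamps[i - 1]).total_seconds() for i in range(1, n)]
    speed = [0.0] + [distances_m[i] / dt[i] for i in range(1, n)]          # m/s
    accel = [0.0] + [(speed[i] - speed[i - 1]) / dt[i] for i in range(1, n)]  # m/s^2
    jerk = [0.0] + [(accel[i] - accel[i - 1]) / dt[i] for i in range(1, n)]   # m/s^3
    return speed, accel, jerk


# Four points sampled every 10 seconds; distances are from the previous point.
times = [datetime(2021, 1, 1, 0, 0, s) for s in (0, 10, 20, 30)]
dists = [0.0, 100.0, 300.0, 300.0]
speed, accel, jerk = kinematics(dists, times)
```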
- static distance_travelled_by_date_and_traj_id(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame, date, traj_id)[source]
Given a date and trajectory ID, calculate the total distance covered in the trajectory on that particular date.
Note
The distance yielded is in metres (m).
- Parameters
dataframe (PTRAILDataFrame) – The dataframe in which the actual data is stored.
date (Text) – The Date on which the distance covered is to be calculated.
traj_id (Text) – The trajectory ID for which the distance covered is to be calculated.
- Returns
The total distance covered on that date by that trajectory ID.
- Return type
float
- Raises
KeyError: – Traj_id is not present in the arguments passed.
- static generate_kinematic_features(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]
Generate all the Kinematic features with a single call of this function.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe on which the features are to be generated.
- Returns
The dataframe enriched with Kinematic Features.
- Return type
- static get_bounding_box(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]
Return the bounding box of the Trajectory data. Essentially, the bounding box is of the following format:
(min Latitude, min Longitude, max Latitude, max Longitude).
- Parameters
dataframe (PTRAILDataFrame) – The dataframe containing the trajectory data.
- Returns
The bounding box of the trajectory
- Return type
tuple
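The tuple layout described above can be reproduced with a few lines of plain Python. This is only a sketch of the format; `bounding_box` is an illustrative helper operating on a list of (lat, lon) pairs rather than on a PTRAILDataFrame.

```python
def bounding_box(points):
    """points: iterable of (lat, lon); returns (min lat, min lon, max lat, max lon)."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return (min(lats), min(lons), max(lats), max(lons))


box = bounding_box([(39.5, -105.2), (40.1, -104.8), (39.9, -105.6)])
```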
- static get_distance_travelled_by_traj_id(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame, traj_id: str)[source]
Given a trajectory ID, calculate the total distance covered by the trajectory. NOTE: The distance calculated is in metres.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe containing the entire dataset.
traj_id (Text) – The trajectory ID for which the distance covered is to be calculated.
- Returns
The distance covered by the trajectory
- Return type
float
- Raises
MissingTrajIDException: – The Trajectory ID given by the user is not present in the dataset.
- static get_end_location(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame, traj_id: Optional[str] = None)[source]
Get the ending location of an object’s trajectory in the data.
Note
If the user does not provide a traj_id, the library returns the end locations of all the unique trajectory ids present in the data by default.
- Parameters
dataframe (PTRAILDataFrame) – The PTRAILDataFrame storing the trajectory data.
traj_id – The ID of the trajectory whose end location is to be found.
- Returns
tuple – The (lat, longitude) tuple containing the end location.
pandas.core.dataframe.DataFrame – The dataframe containing end locations of all trajectory IDs.
- static get_number_of_locations(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame, traj_id: Optional[str] = None)[source]
Get the number of unique coordinates in the dataframe specific to a trajectory ID.
Note
If no Trajectory ID is specified, then the number of unique locations visited by each trajectory in the dataset is calculated.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe of which the number of locations are to be computed
traj_id (Text) – The trajectory id for which the number of unique locations are to be found
- Returns
int – The number of unique locations in the dataframe/trajectory id.
pandas.core.dataframe.DataFrame – The dataframe containing the number of unique locations of all trajectory IDs.
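Counting unique coordinates per trajectory, with the single-ID and all-IDs return shapes described above, might look like the following sketch. It uses plain tuples and a dict in place of dataframes, and `number_of_locations` here is an illustrative stand-in, not the library function.

```python
def number_of_locations(rows, traj_id=None):
    """rows: iterable of (traj_id, lat, lon). Returns an int for one ID, else a dict."""
    seen = {}
    for tid, lat, lon in rows:
        seen.setdefault(tid, set()).add((lat, lon))
    counts = {tid: len(coords) for tid, coords in seen.items()}
    # A missing traj_id raises KeyError in this sketch.
    return counts if traj_id is None else counts[traj_id]


rows = [("a", 1.0, 1.0), ("a", 1.0, 1.0), ("a", 2.0, 2.0), ("b", 5.0, 5.0)]
single = number_of_locations(rows, "a")
everything = number_of_locations(rows)
```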
- static get_start_location(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame, traj_id=None)[source]
Get the starting location of an object’s trajectory in the data.
Note
If the user does not provide a traj_id, the library returns the start locations of all the unique trajectory ids present in the data by default.
- Parameters
dataframe (PTRAILDataFrame) – The PTRAILDataFrame storing the trajectory data.
traj_id – The ID of the object whose start location is to be found.
- Returns
tuple – The (lat, longitude) tuple containing the start location.
pandas.core.dataframe.DataFrame – The dataframe containing start locations of all trajectory IDs.
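A start or end location is simply the coordinate at the earliest or latest timestamp of a trajectory. The sketch below assumes exactly that definition and works on plain tuples; `start_and_end_locations` is a hypothetical helper that covers both accessors at once.

```python
from datetime import datetime


def start_and_end_locations(rows):
    """rows: iterable of (traj_id, timestamp, lat, lon); returns a dict per trajectory."""
    by_traj = {}
    for traj_id, ts, lat, lon in rows:
        by_traj.setdefault(traj_id, []).append((ts, lat, lon))
    result = {}
    for traj_id, pts in by_traj.items():
        pts.sort(key=lambda p: p[0])  # order each trajectory by time
        result[traj_id] = {"start": pts[0][1:], "end": pts[-1][1:]}
    return result


rows = [
    ("a", datetime(2021, 1, 1, 10), 1.0, 2.0),
    ("a", datetime(2021, 1, 1, 9), 3.0, 4.0),
    ("b", datetime(2021, 1, 1, 8), 5.0, 6.0),
]
locations = start_and_end_locations(rows)
```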
ptrail.features.temporal_features module
References
Arina De Jesus Amador Monteiro Sanches. “Uma Arquitetura E Implementação Do Módulo De Pré-processamento Para Biblioteca Pymove”. Bachelor’s thesis. Universidade Federal Do Ceará, 2019.
- class ptrail.features.temporal_features.TemporalFeatures[source]
Bases:
object
- static create_date_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]
From the DateTime column already present in the data, extract only the date and then add another column containing just the date.
- Parameters
dataframe (PTRAILDataFrame) – The PTRAILDataFrame on which the creation of the date column is to be done.
- Returns
The dataframe containing the resultant Date column.
- Return type
- static create_day_of_week_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]
Create a column called Day_Of_Week which contains the day of the week on which the trajectory point is recorded. This is calculated on the basis of timestamp recorded in the data.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe containing the entire data on which the operation is to be performed
- Returns
The dataframe containing the resultant Day_Of_Week column.
- Return type
- static create_time_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]
From the DateTime column already present in the data, extract only the time and then add another column containing just the time.
- Parameters
dataframe (PTRAILDataFrame) – The PTRAILDataFrame on which the creation of the time column is to be done.
- Returns
The dataframe containing the resultant Time column.
- Return type
- static create_time_of_day_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]
Create a Time_Of_Day column in the dataframe using parallelization which indicates at what time of the day was the point data captured. Note: The divisions of the day based on the time are provided in the utilities.constants module.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe on which the calculation is to be done.
- Returns
The dataframe containing the resultant Time_Of_Day column.
- Return type
References
Arina De Jesus Amador Monteiro Sanches. “Uma Arquitetura E Implementação Do Módulo De Pré-processamento Para Biblioteca Pymove”. Bachelor’s thesis. Universidade Federal Do Ceará, 2019.
- static create_weekend_indicator_column(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]
Create a column called Weekend which indicates whether the point data is collected on either a Saturday or a Sunday.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe on which the operation is to be performed.
- Returns
The dataframe containing the resultant Weekend column.
- Return type
References
Arina De Jesus Amador Monteiro Sanches. “Uma Arquitetura E Implementação Do Módulo De Pré-processamento Para Biblioteca Pymove”. Bachelor’s thesis. Universidade Federal Do Ceará, 2019.
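The Date, Time, Day_Of_Week and Weekend columns described above are all derived from a single timestamp. The sketch below shows one way to derive them with the standard library; storing the day as an English name is an assumption, and `temporal_columns` is an illustrative helper rather than a PTRAIL function.

```python
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def temporal_columns(ts):
    """Derive the date, time, day name and weekend flag from one timestamp."""
    return {
        "Date": ts.date(),
        "Time": ts.time(),
        "Day_Of_Week": DAY_NAMES[ts.weekday()],
        # weekday() counts Monday as 0, so 5 and 6 are Saturday and Sunday.
        "Weekend": ts.weekday() >= 5,
    }


saturday = temporal_columns(datetime(2021, 5, 1, 14, 30))
monday = temporal_columns(datetime(2021, 5, 3, 9, 0))
```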
- static generate_temporal_features(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame)[source]
Generate all the temporal features with a single call of this function.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe on which the features are to be generated.
- Returns
The dataframe enriched with Temporal Features.
- Return type
- static get_end_time(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame, traj_id: Optional[str] = None)[source]
Get the ending time of the trajectory.
Note
If the trajectory ID is not specified by the user, then by default, the ending times of all the trajectory IDs in the data are returned.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe on which the operations are to be performed.
traj_id (Optional[Text]) – The trajectory for which the end time is required.
- Returns
pandas.DateTime – The end time of a single trajectory.
pandas.core.dataframe.DataFrame – Pandas dataframe containing the end time of all the trajectories present in the data when the user hasn’t asked for a particular trajectory’s end time.
- static get_start_time(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame, traj_id: Optional[str] = None)[source]
Get the starting time of the trajectory.
Note
If the trajectory ID is not specified by the user, then by default, the starting times of all the trajectory IDs in the data are returned.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe on which the operations are to be performed.
traj_id (Optional[Text]) – The trajectory for which the start time is required.
- Returns
pandas.DateTime – The start time of a single trajectory.
pandas.core.dataframe.DataFrame – Pandas dataframe containing the start time of all the trajectories present in the data when the user hasn’t asked for a particular trajectory’s start time.
- static get_traj_duration(dataframe: ptrail.core.TrajectoryDF.PTRAILDataFrame, traj_id: Optional[str] = None)[source]
Accessor method for the duration of a trajectory specified by the user.
Note
If no trajectory ID is given by the user, then the duration of each unique trajectory is calculated.
- Parameters
dataframe (PTRAILDataFrame) – The dataframe containing the trajectory data.
traj_id (Optional[Text]) – The trajectory id for which the duration is required.
- Returns
pandas.TimeDelta – The trajectory duration.
pandas.core.dataframe.DataFrame – The dataframe containing the duration of all trajectories in the dataset.
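Trajectory duration is the latest timestamp of a trajectory minus its earliest one. A minimal sketch of that calculation on plain tuples follows; `trajectory_durations` is an illustrative name and the dict it returns stands in for the dataframe the library produces.

```python
from datetime import datetime, timedelta


def trajectory_durations(rows):
    """rows: iterable of (traj_id, timestamp); returns {traj_id: timedelta}."""
    bounds = {}
    for traj_id, ts in rows:
        lo, hi = bounds.get(traj_id, (ts, ts))
        bounds[traj_id] = (min(lo, ts), max(hi, ts))
    return {traj_id: hi - lo for traj_id, (lo, hi) in bounds.items()}


rows = [
    ("a", datetime(2021, 1, 1, 9, 0)),
    ("a", datetime(2021, 1, 1, 11, 30)),
    ("a", datetime(2021, 1, 1, 10, 0)),
    ("b", datetime(2021, 1, 2, 8, 0)),
]
durations = trajectory_durations(rows)
```

A trajectory with a single point, such as "b" here, has a duration of zero.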