How to handle missing data in Torch?

1 year ago

Ava Mitchell

PyTorch (imported as torch) has no dedicated missing-value type, so missing entries are usually encoded as NaN in floating-point tensors. Here are some commonly used ways to handle them:

Remove missing data: You can filter out missing values with torch.masked_select(), keeping only the non-missing entries. The result is always a flattened 1-D tensor. For example:

import torch

data = torch.tensor([1, 2, float('nan'), 4, float('nan')])
mask = torch.isnan(data)  # True where a value is missing
filtered_data = torch.masked_select(data, ~mask)  # keep entries where mask is False
print(filtered_data)  # tensor([1., 2., 4.])
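
Because torch.masked_select() flattens its result, it does not preserve the row structure of a 2-D tensor. A minimal sketch of dropping whole rows instead is shown here; the helper name drop_nan_rows and the sample table are made up for illustration:

```python
import torch

def drop_nan_rows(x: torch.Tensor) -> torch.Tensor:
    """Return only the rows of a 2-D tensor that contain no NaN."""
    keep = ~torch.isnan(x).any(dim=1)  # True for rows that are fully observed
    return x[keep]

# Hypothetical sample data: the middle row has a missing value.
table = torch.tensor([[1.0, 2.0],
                      [float('nan'), 4.0],
                      [5.0, 6.0]])
clean = drop_nan_rows(table)
print(clean)  # rows 0 and 2 remain
```

Boolean indexing with a per-row mask keeps the column layout, which is usually what is wanted when each row is one sample.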

Replace missing data: You can use torch.where() to replace missing values with a specified value. For example:

import torch

data = torch.tensor([1, 2, float('nan'), 4, float('nan')])
mask = torch.isnan(data)
# The fill tensor has the same dtype as data, so the call does not rely on type promotion
filled_data = torch.where(mask, torch.zeros_like(data), data)
print(filled_data)  # tensor([1., 2., 0., 4., 0.])
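
A constant is not always a good stand-in. As a sketch of one common alternative, the same torch.where() pattern can fill each column with the mean of its observed values, computed with torch.nanmean(). The helper name fill_with_column_mean and the sample table are assumptions for illustration:

```python
import torch

def fill_with_column_mean(x: torch.Tensor) -> torch.Tensor:
    """Replace NaN entries of a 2-D tensor with the mean of their column."""
    # nanmean ignores NaN entries; keepdim=True gives shape (1, n_cols) for broadcasting
    col_mean = torch.nanmean(x, dim=0, keepdim=True)
    return torch.where(torch.isnan(x), col_mean, x)

# Hypothetical sample data with one gap in each column.
table = torch.tensor([[1.0, 10.0],
                      [float('nan'), 20.0],
                      [3.0, float('nan')]])
filled = fill_with_column_mean(table)
print(filled)  # gaps become 2.0 (column 0) and 15.0 (column 1)
```

A column with no observed values at all still comes out as NaN, since there is nothing to average. For a plain constant fill, torch.nan_to_num(data, nan=0.0) is a shorter alternative to torch.where().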

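Fill missing data using interpolation: PyTorch does not ship a torch.interp() function, so one option is to pass the valid points to NumPy's np.interp() and convert the result back to a tensor. For example:

import numpy as np
import torch

data = torch.tensor([1, 2, float('nan'), 4, float('nan')])
mask = torch.isnan(data)
indices = torch.arange(len(data))
# np.interp fills interior gaps linearly and holds the nearest valid value at the edges
interpolated = np.interp(indices.numpy(), indices[~mask].numpy(), data[~mask].numpy())
interpolated_data = torch.from_numpy(interpolated).to(data.dtype)
print(interpolated_data)  # tensor([1., 2., 3., 4., 4.])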
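
If the fill has to stay inside PyTorch, for example because the tensor lives on a GPU, linear interpolation can be sketched with torch.searchsorted(). This is a minimal 1-D version under stated assumptions, not a library routine: the name interp_fill_1d is hypothetical, the input is assumed to be a floating-point tensor, and leading or trailing gaps simply take the nearest valid value.

```python
import torch

def interp_fill_1d(data: torch.Tensor) -> torch.Tensor:
    """Linearly interpolate NaN entries of a 1-D float tensor."""
    mask = torch.isnan(data)
    if not mask.any():
        return data.clone()
    if mask.all():
        raise ValueError("no valid values to interpolate from")

    idx = torch.arange(data.numel(), device=data.device)
    xp = idx[~mask]   # positions of the known values (already sorted)
    fp = data[~mask]  # the known values themselves

    # For each position, locate the neighbouring known points on either side.
    right = torch.searchsorted(xp, idx).clamp(max=xp.numel() - 1)
    left = (right - 1).clamp(min=0)
    x0, x1 = xp[left], xp[right]
    y0, y1 = fp[left], fp[right]

    # Interpolation weight in [0, 1]; the clamps handle the edges,
    # where both neighbours collapse to the same known point.
    span = (x1 - x0).clamp(min=1).to(data.dtype)
    weight = ((idx - x0).to(data.dtype) / span).clamp(0.0, 1.0)
    filled = y0 + weight * (y1 - y0)

    # Keep the original values wherever they were present.
    return torch.where(mask, filled, data)

data = torch.tensor([1, 2, float('nan'), 4, float('nan')])
result = interp_fill_1d(data)
print(result)  # tensor([1., 2., 3., 4., 4.])
```

The interior gap at index 2 is filled halfway between 2 and 4, while the trailing gap repeats the last known value because there is no point to its right to interpolate toward.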

Which method fits depends on the situation: dropping values works when the gaps are few, a constant fill works when a neutral default exists, and interpolation suits ordered data such as a time series.