Conversation

@5yler (Contributor) commented Jun 26, 2023

No description provided.

@5yler 5yler changed the base branch from sw.parse.default_value to master June 26, 2023 21:18
@5yler 5yler changed the base branch from master to sw.parse.default_value June 26, 2023 21:18
@5yler 5yler force-pushed the sw.parse.default_value branch from 463d42d to 8be4a16 Compare June 28, 2023 02:48
Base automatically changed from sw.parse.default_value to master June 28, 2023 03:00
Comment on lines +421 to +427
# Step 2: Add random time offset to "Date" column
if "Date" in df.columns:
    df["Date"] = pd.to_datetime(df["Date"])
    random_offset = pd.tseries.offsets.DateOffset(
        years=np.random.randint(-1000, 1000)
    )
    df["Date"] += random_offset
Contributor Author

The same offset should be applied to all the dataframes, so that time correspondence between them is preserved.
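
A minimal sketch of one way to do this, assuming the frames are available as a list and share a "Date" column. The helper name `shift_dates_together`, the `rng` argument, and the `max_years` bound are hypothetical and not part of this PR. The bound is deliberately much smaller than the ±1000 years in the diff, because nanosecond-resolution pandas timestamps only cover roughly 1677 to 2262.

```python
import numpy as np
import pandas as pd


def shift_dates_together(dfs, rng, max_years=50, date_col="Date"):
    """Shift `date_col` in every frame by one shared random year offset.

    Hypothetical helper: the offset is drawn once, so rows that had the
    same date in different frames still have the same date afterwards.
    """
    # Draw the offset a single time, outside the per-frame loop.
    years = int(rng.integers(-max_years, max_years + 1))
    offset = pd.DateOffset(years=years)

    shifted = []
    for df in dfs:
        out = df.copy()  # leave the caller's frames untouched
        if date_col in out.columns:
            out[date_col] = pd.to_datetime(out[date_col]) + offset
        shifted.append(out)
    return shifted, years


# Usage: two frames that share a date keep sharing it after the shift.
reflections = pd.DataFrame({"Date": ["2020-01-05", "2020-03-10"]})
ratings = pd.DataFrame({"Date": ["2020-01-05"], "score": [4]})
(reflections_s, ratings_s), years = shift_dates_together(
    [reflections, ratings], np.random.default_rng(0)
)
```

Passing the generator in makes the shift reproducible in tests, while the per-frame loop keeps frames without a "Date" column unchanged.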

all reflections get shifted by a random year offset?

# and apply a random linear transformation to numerical columns
# and replace non-empty strings with a random word
for col in df.columns:
    if df[col].dtype == "object":

Does object mean string here? Is there any way to narrow this down further?
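
One possible way to narrow it down, as a sketch: an `object` column can hold any Python objects, so the values themselves have to be inspected. `pandas.api.types.infer_dtype` does that and returns `"string"` only when every non-missing value is a `str`. The helper name `is_text_column` is hypothetical.

```python
import pandas as pd
from pandas.api.types import infer_dtype


def is_text_column(series):
    """Return True only if all non-missing values in `series` are strings.

    Hypothetical helper: stricter than `series.dtype == "object"`, which is
    also true for columns holding lists, dicts, or mixed values.
    """
    return infer_dtype(series, skipna=True) == "string"


# Usage: only the first column would be treated as text.
text = pd.Series(["good day", float("nan"), "tired"], dtype=object)
mixed = pd.Series(["good day", 3], dtype=object)
numbers = pd.Series([1, 2, 3])
flags = [is_text_column(s) for s in (text, mixed, numbers)]
```

With a check like this, mixed or nested columns could be skipped or handled separately instead of having their values replaced by random words.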

        df[col] = df[col].apply(lambda x: random_word() if x else x)
    elif df[col].dtype in ["int64", "float64"]:
        k = random.uniform(0.5, 10) * random.choice([-1, 1])
        df[col] = df[col] * k

If it's a rating type, won't this be obvious given our current fixed range? What is this protecting against?
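
To illustrate the concern with a small sketch: if the original range of a rating column is known and both extremes occur in the data, the factor `k` can be read straight off the scaled values. The function name `recover_scale` and the 1 to 5 range are assumptions for illustration only. The sign check also assumes the original range is strictly positive.

```python
def recover_scale(scaled, known_min, known_max):
    """Estimate k from values that were produced as original * k.

    Assumptions (hypothetical): the originals lie in the positive range
    [known_min, known_max] and both extremes appear at least once.
    """
    lo, hi = min(scaled), max(scaled)
    # The spread of the scaled values is |k| times the original spread.
    magnitude = (hi - lo) / (known_max - known_min)
    # Positive originals times a negative k give only non-positive values.
    sign = -1.0 if hi <= 0 else 1.0
    return sign * magnitude


# Usage: ratings on an assumed 1-5 scale, multiplied by a secret factor.
ratings = [1, 3, 5, 2, 4]
secret_k = -3.7
scaled = [r * secret_k for r in ratings]
k_est = recover_scale(scaled, known_min=1, known_max=5)
recovered = [round(v / k_est) for v in scaled]
```

Under these assumptions a plain multiplication hides nothing for fixed-range columns, which is the point of the question above.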

3 participants