Tejaswi's blog

Streetlights per ward in Bengaluru

    # Uncomment to install geopandas if not already installed
    # !pip install geopandas
    import geopandas as gpd
    import pandas as pd
    # Load data
    lights = pd.read_csv("C:\\Users\\ADMIN\\Downloads\\streetlights-in-bengaluru-wards.csv")
    world = gpd.read_file("BBMP.geojson")
    # Plot the world GeoDataFrame
    world.plot()
    # Convert 'KGISWardName' columns to string
    world['KGISWardName'] = world['KGISWardName'].astype(str)
    lights['KGISWardName'] = lights['KGISWardName'].astype(str)
    lights['Street lights'] = lights['Street lights'].astype(int)
    # Merge the DataFrames on 'KGISWardName'
    result = pd.merge(world, lights, on='KGISWardName', how='inner')
    result....

Single Period Option Pricing

The basic model of option pricing is the binomial pricing model. A no-arbitrage argument shows that the option price must equal the cost of a portfolio of a bond and the stock that replicates the option's payoff (the payoff diagrams would be the same). The generalisation of the binomial pricing model to multiple periods leads, in the limit of many short periods, to the Black-Scholes formula.
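A minimal sketch of the replication argument for a single period, assuming the stock moves from s0 to either s0*up or s0*down and the bond grows by a factor of 1 + rate. The function name, parameters and numbers are made up for illustration.

```python
def price_call_one_period(s0, up, down, rate, strike):
    """Price a European call in a one-period binomial model by replication."""
    if not down < 1 + rate < up:
        raise ValueError("no-arbitrage requires down < 1 + rate < up")
    # Option payoffs in the up and down states
    payoff_up = max(s0 * up - strike, 0.0)
    payoff_down = max(s0 * down - strike, 0.0)
    # Replicating portfolio: delta shares of stock plus bond_pv invested in the bond
    delta = (payoff_up - payoff_down) / (s0 * (up - down))
    bond_pv = (payoff_up - delta * s0 * up) / (1 + rate)
    # By no-arbitrage the option must cost the same as the portfolio
    return delta * s0 + bond_pv, delta, bond_pv


# Hypothetical numbers: stock at 100 goes to 120 or 80, 5% interest, strike 100
price, delta, bond_pv = price_call_one_period(s0=100, up=1.2, down=0.8, rate=0.05, strike=100)
print(f"price={price:.4f} delta={delta:.2f} bond={bond_pv:.4f}")
```

A negative bond_pv means the replicating portfolio borrows at the bond rate to buy the shares.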

Kelly criterion and betting

I recently came across (and actually understood this time) arbitrage opportunities in FX exchange rates. There is a graph algorithm which, when run against different FX pairs, can tell us if riskless bets can be made. Applying the same idea to horse betting seemed obvious at first glance, but since there is no clear ‘graph’ we can exploit here, it took me some time to understand how it is possible....
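One common way to run the check on FX pairs (an assumption here, the post may use something else) is Bellman-Ford negative-cycle detection: give each conversion an edge weight of -log(rate), so a loop of conversions whose rates multiply to more than 1 shows up as a negative cycle. The function name and the rates below are hypothetical.

```python
import math


def has_arbitrage(rates):
    """Return True if some cycle of conversions multiplies to more than 1.

    rates maps (from_currency, to_currency) -> exchange rate.
    """
    currencies = sorted({c for pair in rates for c in pair})
    # Edge weight -log(rate): a product of rates > 1 becomes a negative cycle
    edges = [(a, b, -math.log(r)) for (a, b), r in rates.items()]
    # Starting every node at 0 acts like a virtual source connected to all nodes
    dist = {c: 0.0 for c in currencies}
    for _ in range(len(currencies) - 1):
        for a, b, w in edges:
            if dist[a] + w < dist[b]:
                dist[b] = dist[a] + w
    # If any edge can still be relaxed, a negative cycle exists
    eps = 1e-12
    return any(dist[a] + w < dist[b] - eps for a, b, w in edges)


# Made-up rates: USD -> EUR -> GBP -> USD multiplies to 0.9 * 0.9 * 1.3 = 1.053
rates = {("USD", "EUR"): 0.9, ("EUR", "GBP"): 0.9, ("GBP", "USD"): 1.3}
print(has_arbitrage(rates))
```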

Bugs in applications & pinned plans in queries

At my place of work, one of the ways non-optimal plans are handled is by gathering stats. The other method is to use SQL profiles, ‘pinning’ plans to queries. Oracle also supports SQL plan baselines, another way of doing a similar thing. This leads to problems, as plans which work today might not work two years down the line (think table stats evolution over time)....

Sampling And Confidence Intervals

Beyond ratios: sampling & confidence intervals. Problem: suppose we have 100 log lines, each with one of three severity levels: INFO, WARN or SEVERE. Since processing all of them might be expensive, how do we sample a proportion of these log lines? What can we say about the ‘population’ of log lines from this sample? Confidence intervals let us make statements such as ‘with x% level of confidence, the number of severe lines in the overall population will be between y and z’....
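A rough sketch of the idea, assuming a simple random sample and the normal-approximation (Wald) interval for a proportion. The population below is made up, and for small samples a Wilson interval would usually be a better choice.

```python
import math
import random


def proportion_ci(successes, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    z=1.96 corresponds to roughly 95% confidence.
    """
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical population of 100 log lines
population = ["INFO"] * 70 + ["WARN"] * 20 + ["SEVERE"] * 10
rng = random.Random(0)  # fixed seed so the run is repeatable
sample = rng.sample(population, 30)  # sample 30 lines without replacement

severe = sample.count("SEVERE")
low, high = proportion_ci(severe, len(sample))
print(f"{severe}/{len(sample)} SEVERE lines in the sample")
print(f"95% CI for the SEVERE proportion: [{low:.2f}, {high:.2f}]")
print(f"i.e. roughly {low * len(population):.0f} to {high * len(population):.0f} SEVERE lines overall")
```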

Scoring functions for breaking cryptopals ciphers

The problem is to find a scoring function: a function which assigns a number to how likely a message is to be English. What I tried for the score computing function: most of them are based on ‘ETAOIN SHRDLU’ (frequency analysis). Position based weighting (higher score is better):

    def scorecompute(msg):
        score = 0
        positions = 'ETAOIN SHRDLU'
        for i in positions:
            # earlier (more frequent) characters get a larger weight
            weight = len(positions) - positions.find(i)
            score += msg.upper().count(i) * weight
        return score

This doesn’t work so well. From this point on higher scores are worse....
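One example of a ‘lower is better’ score is a chi-squared distance between observed and expected letter counts. This is only a sketch and not necessarily what the post goes on to use. The frequency table holds rounded, commonly quoted values for the twelve ETAOIN SHRDLU letters only, and the function name is made up.

```python
# Approximate relative frequencies of the most common English letters
ENGLISH_FREQ = {
    'E': 0.127, 'T': 0.091, 'A': 0.082, 'O': 0.075, 'I': 0.070, 'N': 0.067,
    'S': 0.063, 'H': 0.061, 'R': 0.060, 'D': 0.043, 'L': 0.040, 'U': 0.028,
}


def chi_squared_score(msg):
    """Lower is better: distance between observed and expected letter counts."""
    text = msg.upper()
    total = sum(ch.isalpha() for ch in text)
    if total == 0:
        # No letters at all: treat as the worst possible candidate
        return float('inf')
    score = 0.0
    for letter, freq in ENGLISH_FREQ.items():
        expected = freq * total
        observed = text.count(letter)
        score += (observed - expected) ** 2 / expected
    return score


english = "the rain in spain stays mainly in the plain"
gibberish = "qqqq zzzz xxxx uuuuuuuu jjjj kkkk vvvv www"
print(chi_squared_score(english) < chi_squared_score(gibberish))
```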

Azure Work Items And Org mode

The following post is about viewing data from Azure work items (ticket id, title, etc.) and managing org notes by tagging them. Setting up az cli & the az devops extension for az cli: Azure has a CLI, and for Azure DevOps there is an extension.

    # after installing azure cli & cli devops extension
    # login
    az devops login

Viewing tickets in org mode (using az wiql): this snippet does two things...

Thoughts on Viewing Logs

Log viewing. What I tried: Logview mode (emacs); Notepad++, which supports ‘User Defined Languages’ for keyword highlighting, folding, etc.; and a log search GUI I wrote in tkinter. I tried the GUI first, until I discovered Notepad++ has all of the features I added built into it. How I wanted to go about it: regular expressions for the tokens and parsing for the log format. Turns out you don’t need that level of configuration (see 1). timestamp tz LEVEL message etc....

Disabling SSL while using Git and pip

SSL errors: if you get errors like "unable to get local issuer certificate" or "Failed to connect to github.com port 443 after 21380 ms: Timed out" while using git, or "WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by ‘ConnectTimeoutError" while using pip, it could mean you are behind a corporate proxy, behind tools such as ZScaler. A quick fix for this (although not the safest way) is to disable SSL certificate verification for Git...

Using git with large mono repos

The following approaches can be used to work with large git repositories. Limit large files from being downloaded using ‘filter’, so files such as large binaries and zip files won’t be downloaded.

    # size is in bytes; k, m or g suffixes are accepted (e.g. 10m)
    git clone --no-checkout --filter=blob:limit=<size> your_git_repo.com

Limit git to a set of directories. This can be done using git ‘sparse checkout’.

    # initialize empty git repo
    mkdir your_git_dir; cd your_git_dir
    git init
    # set which sub-directory contents you want
    git sparse-checkout set --cone input-directory
    # add remote and do a pull
    git remote add origin my_git_repo....