Scraping

INSTALL NVIDIA DRIVERS, CUDA, TENSORFLOW AND ANACONDA ON UBUNTU 20.04

August 28, 2022
Operating System, Python, Scraping, Tools, Ubuntu

Quick notes on how to install the packages. The steps are in no particular order; these are raw notes from my temporary build.

Install Ubuntu

Open a terminal
sudo bash
apt-get update
apt-get dist-upgrade
nvidia-smi

Pick the version you want to use
apt install nvidia-utils-510
reboot

Open a terminal
sudo bash
nvidia-smi
ubuntu-drivers autoinstall
apt-get install curl

Go to the Anaconda site and download the latest build. For me it was the one below
cd Downloads/
bash Anaconda3-2021.11-Linux-x86_64.sh
source ~/.bashrc
conda info
conda update conda
conda update anaconda
sudo apt install nvidia-cuda-toolkit
reboot
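
After the reboot it can help to confirm that the command-line tools from the steps above are actually on the PATH. This is a minimal Python sketch, assuming nvidia-smi comes from the driver packages and nvcc from nvidia-cuda-toolkit. The helper name missing_tools is hypothetical.

```python
import shutil

def missing_tools(tools):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]

# Tool names are assumptions based on the packages installed above.
missing = missing_tools(["nvidia-smi", "nvcc"])
print("Missing:", ", ".join(missing) if missing else "nothing")
```

If either name is reported as missing, the matching apt install step probably did not finish.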

Open a terminal
sudo bash
anaconda-navigator

Open a new notebook
pip install --upgrade tensorflow
Press run

Open a new notebook
import tensorflow as tf
print(tf.__version__)

TensorFlow will print its version
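
The version alone does not say whether TensorFlow can see the GPU. The sketch below is one way to check from the same notebook, assuming TensorFlow 2.x, where tf.config.list_physical_devices is available. The function name tensorflow_summary is hypothetical.

```python
import importlib

def tensorflow_summary():
    """Return a one-line summary of the TensorFlow install, or a note if it is missing."""
    try:
        tf = importlib.import_module("tensorflow")
    except ImportError:
        return "TensorFlow is not installed in this environment"
    # An empty list here usually means the driver or CUDA libraries were not found.
    gpus = tf.config.list_physical_devices("GPU")
    return f"TensorFlow {tf.__version__}, GPUs visible: {len(gpus)}"

print(tensorflow_summary())
```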

SCRAPE HISTORICAL FINRA SHORT DATA

May 5, 2021
Scraping, Stock Market, Ubuntu

To quickly get all the data from this page and all the sub pages:

You will need to run the following on an Ubuntu box

Open a terminal

mkdir -p finra-historical/all-years finra-historical/downloads && cd finra-historical/downloads

wget -r -np -c -H https://www.finra.org/filing-reporting/trf/trf-regulation-sho-2020

After the scrape completes, type:

mv regsho.finra.org ../all-years

cd ../

rm -R downloads

This moves only the downloads you want into the directory called all-years and deletes all the extra data.
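
To sanity-check what ended up in all-years, a short Python sketch like the one below can count the files by extension. It assumes it is run from inside finra-historical. The helper name count_by_extension is hypothetical.

```python
from collections import Counter
from pathlib import Path

def count_by_extension(root):
    """Count the files under `root`, recursively, grouped by lowercase extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower() or "(none)"] += 1
    return counts

if __name__ == "__main__":
    # "all-years" is the directory name used in the steps above.
    for ext, n in count_by_extension("all-years").most_common():
        print(ext, n)
```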

SCRAPE DAILY SHORT DATA FROM FINRA

May 5, 2021
Operating System, ParrotOS, Power BI, Raspberry Pi, Raspbian, Scraping, Stock Market, Ubuntu

Here is a quick how-to on getting the daily short data. I use it to see whether the short volume on a particular stock has increased or decreased, using a Power BI dashboard that I created. This guide shows the simple command I run to initially scrape all the data.

From a Linux box

Open a terminal

Type:
mkdir FINRA
cd FINRA
wget -r -np http://regsho.finra.org/regsho-Index.html

wget saves everything under a regsho.finra.org directory. After it downloads the data I run
rm regsho.finra.org/*.html

to remove the additional files that are not relevant, and then I move the data to where I need it.
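
For a quick look at the data outside of Power BI, the Python sketch below computes the short-volume share per symbol for one daily file. It assumes the files are pipe-delimited with a header containing Symbol, ShortVolume and TotalVolume, possibly followed by a trailer line. Check one of your downloaded files before relying on this. The function name and the sample rows are made up for illustration.

```python
import csv
import io

def short_ratios(text):
    """Map symbol -> short volume / total volume for one pipe-delimited daily file."""
    ratios = {}
    for row in csv.DictReader(io.StringIO(text), delimiter="|"):
        try:
            short = float(row["ShortVolume"])
            total = float(row["TotalVolume"])
        except (KeyError, TypeError, ValueError):
            continue  # skip trailer or malformed lines
        if total > 0:
            ratios[row["Symbol"]] = short / total
    return ratios

# Made-up sample in the assumed layout, not real FINRA numbers.
sample = (
    "Date|Symbol|ShortVolume|ShortExemptVolume|TotalVolume|Market\n"
    "20210505|AAA|50|0|200|Q\n"
    "20210505|BBB|30|0|40|N\n"
    "2\n"
)
print(short_ratios(sample))
```

Running it over two consecutive days and comparing the ratios per symbol gives the increase or decrease described above.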

EXPORT TELEGRAM CHAT HISTORY FOR DATA ANALYTICS

April 22, 2021 / April 20, 2021
Scraping

From time to time you come across very valuable Telegram channels with great information in them. You can run analytics on the chats to identify trends and make decisions based on the data. To export the data, perform the following:

Download and install the Telegram Desktop App
Login using your phone number and the code that Telegram sends you in the Phone App
Click on the Telegram Channel that you want to export
Click on the three dots at the top right of the chat window
Select Export Chat History
Place a check mark in all of the boxes and set the size limit to 2000MB
This will give you all the data
If you want chats only, do not place check marks in the extra fields
Click Export
Click Allow from your phone (yes you need to use both the desktop app and your phone app)
After 24 hours or so the Data will be available for you to download
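
As a starting point for the analytics, the Python sketch below counts messages per day. It assumes you picked the machine-readable JSON format in the export dialog and that the export is a result.json file with a top-level "messages" list whose entries carry "type" and an ISO-style "date". Verify this against your own export. The function names are hypothetical.

```python
import json
from collections import Counter

def messages_per_day(export):
    """Count messages per calendar day in a parsed Telegram JSON export."""
    counts = Counter()
    for message in export.get("messages", []):
        if message.get("type") != "message":
            continue  # skip service entries such as joins and pins
        day = message.get("date", "")[:10]  # "YYYY-MM-DD" prefix of the timestamp
        if day:
            counts[day] += 1
    return counts

def load_export(path="result.json"):
    """Read the export file; the default file name is an assumption."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Typical use would be messages_per_day(load_export()) from the folder that holds the export.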

SCRAPING YAHOO FINANCE FOR ALL STOCK DATA

April 18, 2021 / May 13, 2023
Python, Raspberry Pi, Scraping, Stock Market

I found a great little snippet of code over on Kaggle and needed to build a system to run it from:
https://tacticalware.com/jupyter-installation-on-a-headless-raspberry-pi-4-running-ubuntu-20-10/

Once you build the system as I did in my guide above, you will need to do one other thing to get it to run
Open PuTTY
SSH to your Jupyter Pi
Login
Type:
sudo bash
pip install yfinance

Now on another computer you can open your Jupyter notebook at
http://Jupyter:8888

Then go back to the Kaggle notebook

Click File
Click Download
It will give you a file with the extension of ipynb

You can then go back to your Jupyter notebook and upload it

Once you upload it, run the notebook and it should scrape about 2.5GB of data, direct for your viewing pleasure
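
Since the scrape writes roughly 2.5GB, a quick pre-flight check in a notebook cell can save a failed run. The Python sketch below verifies that yfinance is importable and that the target drive has enough free space. The function name preflight and the 2.5 GB default are assumptions taken from the figure above.

```python
import importlib.util
import shutil

def preflight(path=".", needed_gb=2.5, modules=("yfinance",)):
    """Return a list of problems; an empty list means ready to run."""
    problems = []
    for name in modules:
        if importlib.util.find_spec(name) is None:
            problems.append(f"module not installed: {name}")
    free_gb = shutil.disk_usage(path).free / 1e9
    if free_gb < needed_gb:
        problems.append(f"only {free_gb:.1f} GB free, want {needed_gb} GB")
    return problems

print(preflight() or "ready")
```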

Hardware that I used:
Raspberry Pi 4 (4gb)
https://amzn.to/3q551IO

Plugable USB C to M.2 NVMe Tool-free Enclosure
https://amzn.to/3lflV3L

CanaKit 3.5A Raspberry Pi 4 Power Supply (USB-C)
https://amzn.to/3fNTYPu

CanaKit Raspberry Pi 4 Micro HDMI Cable – 6 Feet
https://amzn.to/33u5hr9

Western Digital 500GB WD_Black SN750 NVMe
https://amzn.to/3nZ5pH4

JUPYTER INSTALLATION ON A HEADLESS RASPBERRY PI 4 RUNNING UBUNTU 20.10

April 18, 2021 / May 13, 2023
Python, Scraping, Stock Market

More and more, I am finding my way to Jupyter Notebooks, and guides online are scarce, especially when you are running it on a Raspberry Pi 4. Below is a mixture of a few guides that I found; this is what worked for me.

For my setup I am running a Raspberry Pi 4 from an NVMe M.2 drive connected via USB 3.0. I am NOT using the slow micro SD card. To have the same setup as me, follow this guide:
https://tacticalware.com/boot-raspberry-pi-4-from-m-2-usb-drive/

After your Pi recognizes the M.2 NVMe drive, use Raspberry Pi Imager to write Ubuntu Server 20.10 to the USB drive. Then plug it into the Pi to get ready for the steps below.

Now boot your Raspberry Pi 4 and Login
Then
sudo bash
apt-get update
apt-get dist-upgrade
apt-get install net-tools
sudo hostnamectl set-hostname Jupyter
reboot
sudo bash
sudo rm /usr/bin/python
sudo ln -s /usr/bin/python3 /usr/bin/python
sudo apt-get install python3-pip
sudo pip3 install --upgrade pip
sudo apt-get install npm
sudo npm install -g configurable-http-proxy
sudo -H pip3 install notebook jupyterhub
jupyterhub --generate-config
sudo mv jupyterhub_config.py /root
nano /root/jupyterhub_config.py
ctrl + w to search for
http://:8000
change it to http://:8888
uncomment the line and delete the leading space so it reads c.JupyterHub.bind_url = 'http://:8888'
ctrl + x to exit
y to save
jupyterhub -f /root/jupyterhub_config.py
nano /lib/systemd/system/jupyterhub.service
[Unit]
Description=JupyterHub Service
After=multi-user.target

[Service]
User=root
ExecStart=/usr/local/bin/jupyterhub --config=/root/jupyterhub_config.py
Restart=on-failure

[Install]
WantedBy=multi-user.target

ctrl + x to exit
y to save

sudo systemctl daemon-reload
sudo systemctl start jupyterhub
sudo systemctl enable jupyterhub
sudo systemctl status jupyterhub.service

on a different computer
open an internet browser
http://Jupyter:8888
login as any user
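
If the browser cannot reach the page, it helps to know whether the port is answering at all. The Python sketch below, run on the other computer, tries a plain TCP connection. The hostname Jupyter and port 8888 come from the steps above. The helper name port_open is hypothetical.

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False

if __name__ == "__main__":
    # Adjust the hostname or port if yours differ.
    print(port_open("Jupyter", 8888))
```

A False result points at the service, the firewall, or name resolution rather than at the browser.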

Back on the Raspberry Pi
sudo -H pip3 install testresources
sudo -H pip3 install jupyterlab
jupyter serverextension enable --py jupyterlab --system
nano /root/jupyterhub_config.py
uncomment and modify the c.Spawner.default_url line to reflect the following:
c.Spawner.default_url = '/lab'
ctrl + x to exit
y to save
sudo apt-get install libatlas-base-dev
y
sudo -H pip3 install numpy
pip3 install seaborn
pip install Cython numpy
pip install cmake

sudo addgroup jupyter_admin
sudo adduser tacticalware
sudo usermod -aG jupyter_admin tacticalware
nano /root/jupyterhub_config.py
Add this to the end of the file
c.PAMAuthenticator.admin_groups = {'jupyter_admin'}
ctrl + x to exit
y to save

You are now finished. On the other computer
Open an internet browser
http://Jupyter:8888
And play!

Hardware that I used:
Raspberry Pi 4 (4gb)
https://amzn.to/3q551IO

Plugable USB C to M.2 NVMe Tool-free Enclosure
https://amzn.to/3lflV3L

CanaKit 3.5A Raspberry Pi 4 Power Supply (USB-C)
https://amzn.to/3fNTYPu

CanaKit Raspberry Pi 4 Micro HDMI Cable – 6 Feet
https://amzn.to/33u5hr9

Western Digital 500GB WD_Black SN750 NVMe
https://amzn.to/3nZ5pH4