Wednesday, November 18, 2020

Tip: Data retrieval with Python

 

If you're not familiar with Python, you should learn. It's a powerful programming language that can extract data from websites. And there are many code libraries for doing that retrieval. 

The Google Colaboratory website allows you to easily run Python code from the browser:

https://colab.research.google.com/notebooks/intro.ipynb#recent=true

A good introduction to retrieving Yahoo data is here:

https://medium.com/@realmistic/use-your-computer-to-make-informed-decisions-in-stock-trading-practical-introduction-part-2-832e6f67f590

For example, a program as simple as:

!pip install yfinance
import yfinance as yf
pfe = yf.Ticker('PFE')
pfe.info

...will allow access to all of this data:

{'52WeekChange': 0.025267482,
 'SandP52WeekChange': 0.16119564,
 'address1': '235 East 42nd Street',
 'algorithm': None,
 'annualHoldingsTurnover': None,
 'annualReportExpenseRatio': None,
 'ask': 36.55,
 'askSize': 4000,
 'averageDailyVolume10Day': 82095557,
 'averageVolume': 31018400,
 'averageVolume10days': 82095557,
 'beta': 0.641488,
 'beta3Year': None,
 'bid': 36.52,
 'bidSize': 1800,
 'bookValue': 11.744,
 'category': None,
 'circulatingSupply': None,
 'city': 'New York',
 'companyOfficers': [],
 'country': 'United States',
 'currency': 'USD',
 'dateShortInterest': 1604016000,
 'dayHigh': 37.42,
 'dayLow': 36.3,
 'dividendRate': 1.52,
 'dividendYield': 0.0422,
 'earningsQuarterlyGrowth': -0.714,
 'enterpriseToEbitda': 13.11,
 'enterpriseToRevenue': 5.206,
 'enterpriseValue': 253279617024,
 'exDividendDate': 1604534400,
 'exchange': 'NYQ',
 'exchangeTimezoneName': 'America/New_York',
 'exchangeTimezoneShortName': 'EST',
 'expireDate': None,
 'fiftyDayAverage': 35.149593,
 'fiftyTwoWeekHigh': 39.83871,
 'fiftyTwoWeekLow': 26.451612,
 'fiveYearAverageReturn': None,
 'fiveYearAvgDividendYield': 3.62,
 'floatShares': 5548891741,
 'forwardEps': 2.84,
 'forwardPE': 12.788733,
 'fromCurrency': None,
 'fullTimeEmployees': 88300,
 'fundFamily': None,
 'fundInceptionDate': None,
 'gmtOffSetMilliseconds': '-18000000',
 'heldPercentInsiders': 0.00046999997,
 'heldPercentInstitutions': 0.70032,
 'industry': 'Drug Manufacturers—General',
 'isEsgPopulated': False,
 'lastCapGain': None,
 'lastDividendDate': 1604534400,
 'lastDividendValue': 0.36053,
 'lastFiscalYearEnd': 1577750400,
 'lastMarket': None,
 'lastSplitDate': 1605571200,
 'lastSplitFactor': '1054:1000',
 'legalType': None,
 'logo_url': 'https://logo.clearbit.com/pfizer.com',
 'longBusinessSummary': 'Pfizer Inc. develops, manufactures, and sells healthcare products worldwide. It offers medicines and vaccines in various therapeutic areas, including cardiovascular metabolic and pain under the Eliquis, Chantix/Champix, and Premarin family brands; biologics, small molecules, immunotherapies, and biosimilars under the Ibrance, Sutent, Xtandi, Xalkori, Inlyta, Braftovi + Mektovi brands; and sterile injectable and anti-infective medicines under the Sulperazon, Medrol, Vfend, and Zithromax brands. The company also provides medicines and vaccines in various therapeutic areas, such as pneumococcal disease, meningococcal disease, and tick-borne encephalitis under the Prevnar 13/Prevenar 13 (pediatric/adult), FSME-IMMUN, Nimenrix, and Trumenba brands; biosimilars for chronic immune and inflammatory diseases under the Xeljanz, Enbrel, Inflectra, and Eucrisa brands; and amyloidosis, hemophilia, and endocrine diseases under the Vyndaqel/Vyndamax, BeneFIX, Genotropin, and Refacto AF/Xyntha brands. In addition, the company is involved in the contract manufacturing business. It serves wholesalers, retailers, hospitals, clinics, government agencies, pharmacies, and individual provider offices, as well as disease control and prevention centers. The company has collaboration and/or co-promotion agreements with Bristol-Myers Squibb Company and Astellas Pharma US, Inc.; a licensing agreement with Akcea Therapeutics, Inc; a strategic alliance with Verily Life Sciences LLC; collaboration agreements with Merck KGaA and Valneva SE; a clinical trial collaboration and supply agreement with IDEAYA Biosciences, Inc.; a material transfer and collaboration agreement with BioNTech SE; a clinical supply collaboration with Jiangsu Alphamab Biopharmaceuticals Co., Ltd; a research collaboration and license agreement with BioInvent International AB; and a multi-target drug discovery collaboration with Sosei Group Corporation. Pfizer Inc. was founded in 1849 and is headquartered in New York, New York.',
 'longName': 'Pfizer Inc.',
 'market': 'us_market',
 'marketCap': 201881092096,
 'maxAge': 1,
 'maxSupply': None,
 'messageBoardId': 'finmb_162270',
 'morningStarOverallRating': None,
 'morningStarRiskRating': None,
 'mostRecentQuarter': 1601164800,
 'navPrice': None,
 'netIncomeToCommon': 8686000128,
 'nextFiscalYearEnd': 1640908800,
 'open': 36.93,
 'openInterest': None,
 'payoutRatio': 0.96769994,
 'pegRatio': -12.73,
 'phone': '212 733 2323',
 'previousClose': 36.04,
 'priceHint': 2,
 'priceToBook': 3.092643,
 'priceToSalesTrailing12Months': 4.1497483,
 'profitMargins': 0.17854,
 'quoteType': 'EQUITY',
 'regularMarketDayHigh': 37.42,
 'regularMarketDayLow': 36.3,
 'regularMarketOpen': 36.93,
 'regularMarketPreviousClose': 36.04,
 'regularMarketPrice': 36.93,
 'regularMarketVolume': 62576363,
 'revenueQuarterlyGrowth': None,
 'sector': 'Healthcare',
 'sharesOutstanding': 5558400000,
 'sharesPercentSharesOut': 0.0083,
 'sharesShort': 45925049,
 'sharesShortPreviousMonthDate': 1601424000,
 'sharesShortPriorMonth': 45163995,
 'shortName': 'Pfizer, Inc.',
 'shortPercentOfFloat': 0.0083,
 'shortRatio': 1.93,
 'startDate': None,
 'state': 'NY',
 'strikePrice': None,
 'symbol': 'PFE',
 'threeYearAverageReturn': None,
 'toCurrency': None,
 'totalAssets': None,
 'tradeable': False,
 'trailingAnnualDividendRate': 1.5,
 'trailingAnnualDividendYield': 0.041620422,
 'trailingEps': 1.538,
 'trailingPE': 23.615084,
 'twoHundredDayAverage': 34.76005,
 'volume': 62576363,
 'volume24Hr': None,
 'volumeAllCurrencies': None,
 'website': 'http://www.pfizer.com',
 'yield': None,
 'ytdReturn': None,
 'zip': '10017'}

Wednesday, September 2, 2020

Old Blog post from Randy

 On September 2, 2020 Randy stated:

Alert: No further updates to add-in

At list point, it's looking that there will be no further updates to the add-in. Several reasons:

  1. Too many resources have taken actions to prevent something like the add-in from accessing their data. Mostly by having dynamically created web pages with the data sourced from protected API files.
  2. Microsoft is making it more and more difficult to interact with the Internet via VBA. One situation in particular is that the IE object within EXCEL is no longer as functional as it was. And they've let it lapse to the point that a lot of websites don't even recognize it as a browser, so it can no longer be used to log into the site to get subscriber data.
  3. I've reached the point in my life were I'm pretty much a passive investor, so I no longer have much need of the add-in for financial data. These days, I use it more to automate extraction of data from non-financial websites.
  4. My available time and inclination to work on the add-in has decreased significantly in the last year, partially due to my frustration with the first two reasons.
Most of the more recent functions just use the building block functions like RCHGetWebData() and smfGetTagContent(), with the utility functions smfWord() and smfStrExtr(), so those new functions could be replicated in a spreadsheet. In fact, the newer functions generally come out of me making a prototype spreadsheet with those building block functions and then adapting it to a new function that does the same thing directly.

Although I started the add-in about 20 years, it had stand-alone predecessors in one language or another for about 15 years prior to that. So it's been a while...

I'll still be answering questions in the forums, but may obsolete the Yahoo Groups in the near future. It has gotten to be a bit chaotic to moderate since Yahoo abandoned the groups concept. And it would be better to have discussions on groups.io or Google Groups, where the discussions are archived.

Tuesday, May 19, 2020

What is the smfForceRecalculation macro and how do I use it?

 ---In smf_addin@yahoogroups.com, <rharmelink@...> wrote :

OverviewWhen the RCHGetElementNumber() and RCHGetTableCell() and RCHGetWebData() functions are invoked, they actually extract data from a copy of the web page that was saved by the add-in.  This is done to make the functions run more quickly, because the actual retrieval of a web page from the Internet is the part of the process that takes the most time.  For example, if you were to get 20 or 30 data items from the same web page, this process allows the web page to be retrieved once and then all extractions are done from that single retrieval of the web page from the Internet.  Otherwise, each invocation of those functions would need to grab a new web page.

In general, the process operates as follows:

  • The "saved array" is first checked, to see if the web page the data is being extracted from has already been retrieved from the Internet.
  • If an entry in the "saved array" is found, the data is extracted from the saved copy of the web page.
  • If an entry is not found in the "saved array", the source code of the web page is retrieved from the Internet and then put into the "saved array", which is indexed by URL.  Then, the data is extracted from that saved copy of the web page.
What does the smfForceRecalculation macro do?

When you run this macro, it purges the "saved array" of web pages and then triggers a full recalculation within EXCEL.  Since the "saved array" of web pages has been emptied out, the add-in functions that use the "saved array" of web pages will no longer find a saved copy of a web page, and be forced to retrieve a fresh copy of the web page.  Note, however, that ONLY the first function invocation that uses a given URL will go to the Internet -- because it will save a new copy of the web page, which all other functions using that URL will use instead of going to the Internet to get the web page again.

In addition, this macro will also force all of the RCHGetElementNumber() element definitions to be reloaded.  That was actually the original reason the macro was written -- so I could modify an element definition and tell the add-in to reload the new element definitions and test them to see if the changes worked correctly.

How do I run the smfForceRecalculation macro?

You can use the keyboard shortcut alt-F8 to bring up EXCEL's macro dialog window.  Unfortunately, this window does NOT list macros that are part of an add-in.  That means you need to type in the macro name.  When you complete entering the macro name, you should see the various buttons activate.  You then click "Run" to run the macro.

I have actually assigned the macro to a keyboard shortcut (done from the "Options..." button, which is located several buttons below the "Run" button used above).  I also have added a button to my Quick Access Toolbar in EXCEL 2007 that runs the macro.  It is what I use most often.  In fact, I usually click on it each time before I open a new workbook, to reset the "saved array" of web pages for that workbook.  But you probably don't want to do that if you have a number of workbooks open that use add-in functions, since the macro will force ALL add-in functions in ALL workbooks to recalculate.

Additional Notes

The RCHGetYahooQuotes() and RCHGetYahooHistory() functions do NOT use the "saved array" of web pages.  That's because they both return an entire set of data with each invocation of the function.  So there is no advantage to saving the web page -- since all possible data is extracted from a single Internet access.  In fact, Yahoo is returning CSV files, not web pages.  So the functions just need to parse the CSV files and place the data into the range they were given access to when the formulas were array-entered.

Note, however, that the smfForceRecalculation macro WILL make those functions grab new copies of the CSV files and parse them out.

What do I need in addtion to smfForceRecalculation to get my data to refresh?

 

---In smf_addin@yahoogroups.com, <rharmelink@...> wrote :

Another possible reason data is not refreshing is that you have your Internet Options set up to cache web pages. Since EXCEL and the add-in use the same Internet access engine as IE, it also utilizes IE's cookies and "Temporary Internet Files" folder.

To check/change your page caching settings for the "Temporary Internet Files" folder:
  1. Click on your system's "Start" button.
  2. Click on the "Control Panel" option.
  3. Double-click on "Internet Options".
  4. In the "Browser settings" section, click on the "Settings" button.
  5. Under where it says "Check for newer versions of stored pages:", you want to toggle the "Every time I visit the web page" radio button. Without this, the add-in would just retrieve IE's stored copy of the web page instead of the Internet access engine getting a fresh copy of the web page from the Internet.
  6. Click "OK" to save the settings and exit
  7. Click "OK" to exit
  8. Close the "Control Panel" window.
Did you need to change your setting there?

Monday, March 9, 2020

New ZIP file -- Version 2020.03.09

 Because of the previously noted issues with Microsoft's changes to VBA's EVALUATE() function, I had to add some special processing for the add-in's use of that function.

So I just uploaded a new version of the ZIP archive.



New files on the web site:


Best practice is to exit EXCEL when updating SMF add-in files.

SMF new Add-In version 3.0.2024.04.29

  This new release has fixes for: module smfGetOptionQuotes A member (Bruce) found the issue was with the crumb processing and provided cod...