Scottobear's Blogspot

Experimental stuff... you can't beat free hosting, can ya?

Location: North Beach, MD, United States

Sunday, May 17, 2020

W.O.M.O.M.F.G

Weaponeers of Monkaa Outlandish Mini Figure Gohlem, with black paint details: W.O.M.O.M.F.G. What is cooler than a sweet Glyos robot? One in the flesh-tone color scheme of the M.U.S.C.L.E. men. Note that the ab piece is removed so you can make him extra stubby to match the style, too. The custom OMFG logo on the chest is also a spiffy variant. #weaponeersofmonkaa #spymonkeycreations #gohlem #womomfggohlem #octobertoys #designercon2014 #Onelldesigns https://bit.ly/2WFEtkL


Friday, May 17, 2019

May 17, 2019 at 05:42PM

Air & Space Museum. via Instagram http://bit.ly/2Jt8X3V

Tuesday, May 17, 2016

… something wicked this way comes.

via Instagram http://bit.ly/1V8nLDI


By the pricking of my thumbs ….

via Instagram http://bit.ly/1swE16J


CF11 issue

I’m trying to find an efficient way of pulling in a website’s metadata keywords (from the <meta> tag). The server is running ColdFusion 11 (CF11).

So far I’ve tried using the cfhttp tag to pull in the page, but based on what I’m reading online, people don’t seem to recommend using regular expressions for this task. The alternative seems to involve finding or building some sort of HTML parser, but I haven’t found any that work well, and I don’t have control over the server, so I can’t install anything on it. I looked into ColdFusion’s XmlParse(), but that doesn’t seem to be what I’m after either.

The websites I’m going to pull this data from are not standardized, so I can’t rely on the <meta name="keywords" {…} /> tag being in the same format every time. The tag could be missing, the name attribute could come at the front or at the end, and the tag could close with /> or with just >.

Any tips on how to do this efficiently, without using too much processing power? The result should just be a string of the keywords found on the website I point it at.

You want to look at jsoup

https://jsoup.org

Add the jar to your CF server and you can very easily use it for parsing HTML.

It uses a selector syntax very similar to jQuery’s, which makes it really easy to use and powerful.
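To illustrate why a real parser copes with the messy tags described in the question, here is a minimal sketch. It is an assumption-laden illustration, not jsoup and not CFML: it uses Python's standard-library html.parser, and the names MetaKeywordsParser and extract_keywords are made up for this example. The parser hands over attributes as name/value pairs, so attribute order, letter case, and /> versus > stop mattering.

```python
from html.parser import HTMLParser


class MetaKeywordsParser(HTMLParser):
    """Collects the content attribute of every <meta name="keywords"> tag."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names and passes the
        # attributes as (name, value) pairs, so their order is irrelevant.
        # Self-closing tags (<meta ... />) are routed here as well.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").strip().lower()
        content = attributes.get("content")
        if name == "keywords" and content:
            self.keywords.append(content.strip())


def extract_keywords(html):
    """Return the keywords as one string, or "" if the tag is missing."""
    parser = MetaKeywordsParser()
    parser.feed(html)
    parser.close()
    return ", ".join(parser.keywords)
```

With jsoup itself, the equivalent lookup would be written as a selector rather than a callback class, but the idea is the same: let the parser normalize the tag, then ask for the attribute.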

Try parsing the HTML as XML and looking up the tag with XPath expressions.

I tried storing the entire page as a string and parsing it with XmlParse(), but the function doesn’t seem designed to make it easy to traverse the HTML DOM structure and pull out the information you want. I was hoping to find something similar to a jQuery selector that finds the element you want and lets you easily pull out whatever information you’re looking for. I need a server-side solution, though, so I can’t use client-side tools.

Do you mean a different approach from the one I took, though? I’m not familiar with XPath expressions and not sure how to approach the problem from that angle, but I will read up on them tomorrow. Thanks!

There are a LOT of libraries in other languages that can traverse elements the way jQuery can, using element, ID, and class selectors.

I’ve done it in Ruby, PHP, and C#. I’m not aware of any for CFML.

XPath is not HTML-specific; it’s a way to select and traverse XML nodes by element name, attributes, etc. It should be pretty easy if you’re just looking to get meta tags.
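For anyone new to XPath, here is a rough sketch of what such a lookup looks like. It is written in Python rather than CFML, using the standard library's xml.etree.ElementTree, which supports only a limited subset of XPath; the function name keywords_via_xpath is hypothetical. It assumes the page is well-formed XML with no namespace declaration and a lowercase name attribute, which many real pages will not satisfy.

```python
import xml.etree.ElementTree as ET


def keywords_via_xpath(xhtml):
    """Return the keywords content, or "" if absent or not well-formed XML."""
    try:
        root = ET.fromstring(xhtml)
    except ET.ParseError:
        # Typical HTML (for example an unclosed <meta ...> tag) is not
        # well-formed XML, so the XML parser rejects it outright.
        return ""
    # Select any descendant <meta> element whose name attribute is "keywords".
    node = root.find(".//meta[@name='keywords']")
    if node is None:
        return ""
    return node.get("content", "")
```

The path expression is the part that carries over: in ColdFusion, a similar expression can be passed to XmlSearch() on a document returned by XmlParse(). The catch shown in the except branch is the same one described in the question: a page that closes its meta tag with just > will generally fail to parse as XML at all.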
