plaidfuji
a year ago
I work with chemical and materials companies on full stack data capabilities and interactive viz / dashboarding is a recurring pain point. Frequently there are larger scale processes (beyond a single engineer’s or scientist’s scope) that still require scientific-level interactive viz that the likes of PowerBI or Tableau can’t (or won’t?) provide… if your company even has a subscription. Things like being able to dynamically re-group/nest variables and recalculate statistical tests, dual axis capabilities, automatic sig fig number reformatting, just all kinds of quality-of-life features that in some cases are possible but in most cases are considered extreme edge cases and require too much manual config / aren’t templatizable.
Of course on the other end you’ve got the whole Python/matplotlib/seaborn/bokeh/plotly/vega/altair “ecosystem” (although it’s more of a swamp if you ask me), which requires someone to maintain Python code and a means to stand up an internal server. Not to mention that most use cases require significant customization. Plotly Dash always seems somewhat promising, but as someone below mentioned, it’s actually kind of slow? Every time I try it I’m just kind of underwhelmed.
I hear ggplot in R is good but I’ve never used R and it’s hard to get a critical mass of people in a company behind R so that’s kind of off the table.
The only programs that really get the aesthetics of scientific plotting right without a ton of customization are JMP, Origin, and Igor Pro (props if you’ve heard of it), but these are all desktop apps… although JMP is starting to make a push into cloud-hosted stuff.
I guess all that is to say if anyone is interested in starting a company in this space, let me know.
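To make the sig fig point concrete, here is a minimal Python sketch of the kind of automatic number reformatting I mean. The function name `format_sig_figs` and the default of 3 significant figures are my own assumptions for illustration, not something any of the tools above expose.

```python
from math import floor, log10


def format_sig_figs(value: float, sig: int = 3) -> str:
    """Format `value` with `sig` significant figures in plain decimal notation.

    Hypothetical helper for illustration only.
    """
    if value == 0:
        return "0"
    # Position of the leading digit: 0 for 1-9, 1 for 10-99, -1 for 0.1-0.9, ...
    exponent = floor(log10(abs(value)))
    # Decimal places needed to keep `sig` significant digits (may be negative).
    decimals = sig - 1 - exponent
    rounded = round(value, decimals)
    # Rounding can push the value up a decade (e.g. 9.996 -> 10.0),
    # so recompute the decimal places from the rounded value.
    if rounded != 0:
        decimals = sig - 1 - floor(log10(abs(rounded)))
    # Negative `decimals` means we rounded to tens/hundreds; print no decimals.
    return f"{rounded:.{max(decimals, 0)}f}"


print(format_sig_figs(1234.567))   # large value, rounded to the tens place
print(format_sig_figs(0.0012345))  # small value, leading zeros are not significant
```

A dashboard could apply something like this to every axis label and table cell by default, which is the part that usually takes manual config.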
cactusfrog
a year ago
I’m working in biotech. I think creating “curated” materialized views with dbt + a data warehouse or DuckDB, and using WebGL with d3, will be the best long-term solution: https://blog.scottlogic.com/2020/05/01/rendering-one-million...
Matplotlib works well for static plots. Altair and others freeze at around 4000 data points, which is crazy. Streamlit + matplotlib is impossible to maintain but is quick to get up and running.
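A rough sketch of the "curated view" idea, assuming a made-up `measurements` table. To keep it runnable with nothing installed it uses Python's built-in sqlite3 as a stand-in for DuckDB or a warehouse, and a plain `CREATE TABLE ... AS SELECT` as a stand-in for a real materialized view (SQLite doesn't have those). All table and column names are hypothetical.

```python
import sqlite3

# In-memory database standing in for DuckDB / a data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE measurements (batch TEXT, temp_c REAL, yield_pct REAL)"
)
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?, ?)",
    [
        ("A", 20, 70.0),
        ("A", 25, 74.0),
        ("B", 20, 60.0),
        ("B", 25, 62.0),
        ("B", 30, 64.0),
    ],
)

# The "curated" layer: a small pre-aggregated table the front end can query
# instead of scanning the raw rows on every interaction.
conn.execute(
    """
    CREATE TABLE batch_summary AS
    SELECT batch, COUNT(*) AS n, AVG(yield_pct) AS mean_yield
    FROM measurements
    GROUP BY batch
    """
)

summary = conn.execute(
    "SELECT batch, n, mean_yield FROM batch_summary ORDER BY batch"
).fetchall()
print(summary)
```

In a dbt setup the SELECT would live in a model file and the warehouse would own the refresh; the point is only that the viz layer reads the small curated table.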
mjoin
a year ago
Exactly. This is the exact stack I envision to be the future in that space. The work of Scott Logic is awesome (and the company looks really nice as well! People and values)
plaidfuji
a year ago
I completely agree with the entire stack. I’ve basically been learning d3 for this exact reason - the primitives are so intuitive and I can tell I’ll be able to make what I want. And yeah streamlit is so close to being useful but just not quite there. But isn’t plotly built on d3?
hantusk
a year ago
I think https://github.com/uwdata/mosaic is really promising here. See the example https://idl.uw.edu/mosaic/examples/linear-regression.html where the user can recalculate a linear regression based on their selection.
You'd still need to implement anything that's missing, such as custom selection widgets and data transformations (like other statistical tests), but I like the technical design as something to build on top of. It uses https://github.com/observablehq/plot under the hood, which aims to have just as flexible a grammar as ggplot (it's already quite capable) but with interactive features (it's built by the creator of d3 and uses d3 under the hood).
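For anyone who hasn't opened the example, the core idea of "recalculate the regression for the current selection" can be sketched in a few lines of plain Python. This is not Mosaic's API; `fit_line`, `fit_selection`, and the sample data are all made up for illustration, and the "selection" is just an x-range like a brush would produce.

```python
def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fit_selection(points, x_min, x_max):
    """Refit using only the points inside the brushed x-range (inclusive)."""
    selected = [(x, y) for x, y in points if x_min <= x <= x_max]
    return fit_line(selected)


# Hypothetical data: rising trend for x in 0..4, falling trend for x in 5..9.
data = [(x, 2 * x + 1) for x in range(5)] + [(x, 20 - x) for x in range(5, 10)]

left_slope, left_intercept = fit_selection(data, 0, 4)
right_slope, right_intercept = fit_selection(data, 5, 9)
print(left_slope, left_intercept)
print(right_slope, right_intercept)
```

The hard part in a real tool is pushing the filter and the aggregation down to the database so this stays fast on large data, which is the piece Mosaic's design is aimed at.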
hatmatrix
a year ago
Igor Pro is nice, but the underlying language, which grew out of a set of macros, is pretty rough to work with (by default it works via side effects and mutation of global state, though there are ways to contain them). However, I know people who have built some nice GUIs on top of it. While it does have HDF5 I/O integration, there's no memory mapping and it will choke on larger data sets from what I recall.
I've heard promising things about Makie [1] in Julia; there is also a framework for building dashboards called Genie [2] (and a commercial dashboard builder [3]), though I'm not sure if Makie and Genie play nicely together at the moment.
[1] https://docs.makie.org/ [2] https://genieframework.com/ [3] https://info.juliahub.com/blog/create-low-code-apps-on-julia...
puterproblems
a year ago
I agree with cactusfrog that d3 is a step in the right direction for dataviz. It really is a swamp out there, with twenty or so different ways to do similar things but usually not quite what you want (or at least not as easy as you want it to be)! I'd been researching dataviz for a while, out of curiosity and annoyance with the current state of dataviz software, before this post popped up, and I would be interested in integrating a lot of the past decade-or-so of open-source work in this space into a tool for full stack data viz. I'd love to work with as many engineers who do data viz as I can! You can get in touch with me via email at puterproblems [at] proton.me if you're interested (and to find out more about me as I'm aware this is a brand new account :D).
analog31
a year ago
Doesn't R require coding too? And my recollection of Igor Pro was from long ago (on a 68k Mac), but it also required coding in its bespoke scripting language. In fact I walked away from it for that exact reason... I wasn't going to spend brain cells on somebody's proprietary language.
I think that for things like dashboards, we're still stuck between "code" and "no code" tools. I don't know of a happy medium.
slashdave
a year ago
Vega/Altair is a declarative grammar, so I wouldn't dump it in with the others. It's also convenient because it reduces to JSON (easy to store) and has TypeScript libraries for native presentation in a browser.
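As a small illustration of the "reduces to JSON" point, here is a minimal Vega-Lite scatter plot spec written as a plain Python dict and round-tripped through the standard json module. The field names `temp_c` and `yield_pct` and the data values are made up; the top-level keys (`$schema`, `data`, `mark`, `encoding`) are the standard Vega-Lite ones.

```python
import json

# A chart is just data: this dict is the whole plot definition.
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {
        "values": [
            {"temp_c": 20, "yield_pct": 71.2},
            {"temp_c": 25, "yield_pct": 74.8},
            {"temp_c": 30, "yield_pct": 69.5},
        ]
    },
    "mark": "point",
    "encoding": {
        "x": {"field": "temp_c", "type": "quantitative"},
        "y": {"field": "yield_pct", "type": "quantitative"},
    },
}

# Serialize for storage (a file, a database column, a git repo) ...
serialized = json.dumps(spec, indent=2)

# ... and load it back unchanged, ready to hand to a browser-side renderer.
restored = json.loads(serialized)
print(restored["mark"], list(restored["encoding"]))
```

Because the spec is plain data, it can be versioned, diffed, and templated like any other config, which is harder to do with imperative plotting code.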