Add compiler-support for deriving Debug type class #4390

Closed
JordanMartinez wants to merge 6 commits into purescript:master from JordanMartinez:jam/derive-debug

@JordanMartinez
Contributor

Description of the change

See purescript/purescript-prelude#272 for more context.

By adding compiler support, it means most other types can get a Debug instance by just deriving it. Moreover, it means adding support across core libraries would be easy because each type could have its instance derived.

This implementation does not add support for deriving opaque types (e.g. foreign import data X :: Type) because one will likely want to write those types' Debug instances by hand.


Checklist:

  • Added a file to CHANGELOG.d for this PR (see CHANGELOG.d/README.md)
  • Added myself to CONTRIBUTORS.md (if this is my first contribution)
  • Linked any existing issues or proposals that this pull request should close
  • Updated or added relevant documentation
  • Added a test for the contribution (if applicable)

@rhendric
Member

What does this get us over genericDebug? I think if instance implementations can be provided through generics, we should prefer that over more custom derived classes.

@natefaubion
Contributor

natefaubion commented Sep 16, 2022

I think if instance implementations can be provided through generics, we should prefer that over more custom derived classes.

I don't have a strong opinion on this, but anecdotally I would say most questions in Discord around deriving (and Show for example) are "If the compiler can derive it, why do I need Generic?". That is, I would say most newcomers find the trip through Generic arbitrary and somewhat annoying. Supporting it in the compiler also arguably gives it a "blessing". Also, Eq and Ord are counter examples.
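For readers who have not hit this detour themselves, here is a minimal sketch of the two routes being contrasted. It uses Show and genericShow as stand-ins for Debug and genericDebug (the latter's API is not shown in this thread), and Shape is a hypothetical example type.

```purescript
module GenericRouteSketch where

import Prelude

import Data.Generic.Rep (class Generic)
import Data.Show.Generic (genericShow)

-- Hypothetical example type.
data Shape
  = Circle Number
  | Square Number

-- Eq is derived directly by the compiler: one line, no helper imports.
derive instance Eq Shape

-- Show is not compiler-derivable, so it takes two steps:
-- 1. derive Generic ...
derive instance Generic Shape _

-- 2. ... then write the instance by hand, delegating to genericShow.
instance Show Shape where
  show = genericShow
```

The second route needs two extra imports and an extra declaration, which is the friction newcomers tend to ask about. For a recursive type the instance body would also have to be eta-expanded (`show x = genericShow x`).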

@rhendric
Member

If the verbosity of the required declarations is the problem, I'd rather see #3824 resumed, or whatever other syntactic improvements would benefit instance deriving in general.

Isn't bringing Debug into the prelude enough of a blessing—does it really need to be baked into the compiler for people to take it seriously?

I don't mind grandfathering in Eq and Ord—I wouldn't even mind seeing Hashable brought in alongside them—because equality-related operations generally need to be performant. Not so with debugging printers, I don't think.

I don't want to come across as super-dogmatic about this either; it just seems to me like we should hold some line against implementing everything in the compiler, and I don't know if this makes the cut.

@f-f
Member

f-f commented Sep 16, 2022

I think an interesting question at this point is: can we provide something more than what Generic can do if we have Debug derived on the compiler side? If yes, what would that look like?

@garyb
Member

garyb commented Sep 16, 2022

Another consideration with generic vs explicitly supported deriving is runtime performance, but in this particular case I don't feel like it is a factor because Debug doesn't seem like something that would/should be used when performance matters.

@natefaubion
Contributor

If the verbosity of the required declarations is the problem, I'd rather see #3824 resumed, or whatever other syntactic improvements would benefit instance deriving in general

Does deriving via help? Honestly, to me it feels like you need to use, and explain to newcomers, an even more obscure feature just to get basic debugging support. Deriving via is neat, but it comes with a lot of implicit knowledge around newtypes and coercions directing instance resolution. My main concern is that this is meant to be a pretty pervasive inclusion, and it’s also one of the first things newcomers want to do. I think reducing that friction as much as possible, and making it feel straightforward and non-intimidating, is as justifiable as performance.

@rhendric
Member

I think reducing that friction as much as possible, and making it feel straightforward and non-intimidating, is as justifiable as performance.

That's a solid point. But the way I see it, the barrier to being able to point newcomers at generic implementations of things like Debug is all of the friction described in #3426. We can bypass those issues for Debug by making it derivable by the compiler, and if this is causing new users an unacceptable amount of short-term pain maybe we should. But in the long term, wouldn't it be better to continue down the road of addressing all of those points in the general case? (Since that issue was written, AFAIK we've made instance names optional and that's it.) Deriving via would be progress along that road, but it's not the only way I could imagine skinning this cat.

For example, if we had #3067, we could make things even more straightforward (instance Debug YourType versus derive instance Debug YourType).

Our Project Values talk about preferring fewer powerful general-purpose features over many special cases, and the desirability of solving problems downstream of the compiler. My concern here is that we may be shying away from those values because the design of the general-purpose features seems daunting, whereas we all know how to implement a new compiler-derived class. But these proposals are years old and most of them are following stable language features in Haskell; we should not be afraid to move forward on them. And doing so will benefit larger parts of the PureScript experience than this one specific class.

@natefaubion
Contributor

Is there a path to making a Generic derived implementation as straightforward to use as stock deriving? Honestly, it kind of drives me nuts that the path we don't want to take is the path with the most straightforward syntax and semantics for downstream users. Haskell has default signatures and implementations, which helps a lot. But we have no proposal for defaults because it's a notorious foot-gun even in Haskell land.

I know that I said I didn't have a strong opinion, but even:

data Foo = ... deriving (Generic, Debug via Debug.Generically Foo)

Is frankly, completely obtuse to a newcomer compared to:

data Foo = ... deriving Debug

When it comes to just wanting to show something in the repl.

The ideal situation for me would be to have just that, but still let it be derived via Generics, possibly through an ad-hoc solved instance if Generic is not otherwise declared.

@JordanMartinez
Contributor Author

For context, I opened this PR partly to see what discussion would result from doing that. This was my thinking: while I could migrate purescript-debugged and its instances into prelude and across the core libraries using Generic, it would be easier to do that if I could just write derive instance Debug <TypeName> and call it a day. I was expecting to get pushback, but I just wasn't sure what that pushback would be.

Having read through these comments, I'm actually surprised that no one made an objection to this PR on the basis that purescript-debugged isn't well-tested/used yet. If we wanted to change that library's API, it would be a bigger breaking change by implementing compiler support here than if we just changed it in prelude. (Arguably, I'm not sure that matters that much since compiler and prelude breaking changes are typically done together, but I did want to point that out.)

I'll respond to some of the above comments now.

We can bypass those issues for Debug by making it derivable by the compiler, and if this is causing new users an unacceptable amount of short-term pain maybe we should. But in the long term, wouldn't it be better to continue down the road of addressing all of those points in the general case?

I do agree with Ryan that deriving Debug by the compiler only after finding that the "Deriving Via"-approach was too much pain for new users is a decision that better upholds our project values in general. It seems that most people coming to PureScript already have some FP behind them, so would using the Deriving Via Generic syntax really be an issue? While we can assume yes, we don't have harder data than that.

data Foo = ... deriving (Generic, Debug via Debug.Generically Foo)
vs
data Foo = ... deriving Debug

Still, I agree with Nate that the second version above is easier for newcomers. Those unfamiliar with FP languages will both 1) find it easier to understand the second over the first and 2) be able to use Debug sooner before the Deriving Via feature is explained.

When it comes to just wanting to show something in the repl.

While the REPL would benefit from this, it's actually not my main intent here. I feel like Try PureScript is the REPL of choice because it also allows one to run code on a browser, the main target of PureScript.

The ideal situation for me would be to have just that, but still let it be derived via Generics, possibly through an ad-hoc solved instance if Generic is not otherwise declared.

This is an interesting idea. If I'm understanding you correctly, writing

data Foo = ... deriving Debug

would implicitly become

data Foo = ... deriving (Generic, Debug via Debug.Generically Foo)

If I don't explicitly derive Generic, should the compiler do that for me? Is this too much "magic"?

Having said all that, I think I agree with Ryan more in that this shouldn't be done, at least not yet. I'd like to first verify that the API defined by the purescript-debugged library works well and that Debug gets enough adoption before further "blessing" it with compiler support.


On a different note, I'd like to propose a new general compiler feature based on Nate's idea above. This would entail a lot of work, so it's not quite relevant as to whether this deriving Debug should be supported or not. However, it's an idea that could resolve some of the issues raised in this PR.

What if the data Foo = ... deriving (Generic, Debug via Debug.Generically Foo) syntax was just the following? I'll use the term "stock-deriving type class syntax" to refer to syntax that looks like deriving SomeTypeClass or deriving (TypeClass1, TypeClass2).

data Foo = ... deriving (Generic, Debug)

If we had annotations, what if we could define a type class like this? And then all type classes that don't need special compiler support (e.g. Debug, Enum, Semigroup, etc) can use "stock-deriving type class syntax" to derive those instances by using the corresponding genericX functions referenced by that annotation?

class Enum a where
 @derivable Data.Enum.Generic.genericPred
 pred :: a -> Maybe a
 @derivable Data.Enum.Generic.genericSucc
 succ :: a -> Maybe a

Whenever the compiler comes across a type class that is being derived using "stock-deriving type class syntax", it either

  1. derives a type class with actual compiler support (e.g. Eq, Ord, Hashable)
  2. looks up the corresponding Generic implementation based on the class member's annotations and uses that

The end result is syntax that

  1. looks like "stock" deriving and requires no knowledge of the newtype/coercion mechanics behind Deriving Via (the concern Nate raised)
  2. avoids "blessing" a performance-irrelevant type class, Debug, that can otherwise be implemented via genericDebug (Ryan's concern, Gary's point)
  3. adds only a few more errors, all of which are actionable (explained below).

Rather than "blessing" this type class, this idea would implement a general compiler feature that is otherwise beneficial to multiple type classes, not just Debug.

Let's assume we're a new user. Here's the error message "workflow" I might go through when I try this out for the first time:

module Foo where

data Foo
  = ...
  deriving Debug

Compiler error 1: Debug can only be derived if the type, Foo, also derives Generic.

module Foo where

import Data.Generic.Rep (class Generic)

data Foo
  = ...
  deriving (Generic, Debug)

Compiler error 2: In type class, Debug, type class member, debug, can only be derived for the type, Foo, if 'Data.Debug.Generic (genericDebug)' is also imported.

module Foo where

import Data.Generic.Rep (class Generic)
import Data.Debug.Generic (genericDebug)

data Foo
  = ...
  deriving (Generic, Debug)

And then the code works.

One potential downside of this idea is where the GenericDebug/genericDebug code would live. To prevent orphan instances, type class instances for some types defined in a dependency (e.g. Maybe) will need to be defined in the same module that declares the class. If those instances are derived via Generic, then the genericMember identifier will need to be in the same module as the class.
In other words, we currently have a Data.Debug module that defines the Debug type class and a Data.Debug.Generic module that implements genericDebug via Generic. If instances in Data.Debug are derived via Generic, we would need to move the Data.Debug.Generic module into the Data.Debug module to ensure this still worked. And that might actually be a good design decision (that a genericX is always defined in the same module as the type class whose member it implements) as then we could define a type class with annotated members as:

module Data.Enum where

class Enum a where
 @derivable genericPred
 pred :: a -> Maybe a
 @derivable genericSucc
 succ :: a -> Maybe a

genericPred :: ...

genericSucc :: ...

And then we could write:

module Data.Foo where

import Prelude

import Data.Generic.Rep (class Generic)

-- Assuming `genericDebug` wasn't included in `import Prelude`
import Data.Debug (genericDebug)
import Data.Enum (class Enum, genericPred, genericSucc)

data Foo
  = ...
  deriving (Eq, Ord, Generic, Debug, Enum)
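For comparison, the following sketch shows what has to be written by hand today to get the same Enum instance, and so roughly what the proposed annotation would presumably expand to. It assumes the existing genericPred and genericSucc from Data.Enum.Generic in purescript-enums; Color is a hypothetical type, and Debug is left out because its generic helper is not part of this thread.

```purescript
module DesugarSketch where

import Prelude

import Data.Enum (class Enum)
import Data.Enum.Generic (genericPred, genericSucc)
import Data.Generic.Rep (class Generic)

-- Hypothetical example type.
data Color = Red | Green | Blue

-- Classes with real compiler support are derived directly.
derive instance Eq Color
derive instance Ord Color
derive instance Generic Color _

-- Classes without compiler support delegate each member to its generic
-- counterpart. Under the proposal, the @derivable annotations would tell
-- the compiler which function to plug in for each member.
instance Enum Color where
  pred = genericPred
  succ = genericSucc
```

The instance body is purely mechanical, which is what makes it a plausible candidate for generation from per-member annotations.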

@natefaubion
Contributor

natefaubion commented Sep 17, 2022

I do agree with Ryan that deriving Debug by the compiler only after finding that the "Deriving Via"-approach was too much pain for new users is a decision that better upholds our project values in general. It seems that most people coming to PureScript already have some FP behind them, so would using the Deriving Via Generic syntax really be an issue? While we can assume yes, we don't have harder data than that.

Having experience with FP does not necessarily correlate with knowledge of typeclasses, instance resolution, newtypes, deriving, or Generics. If your knowledge of FP is largely based on JS, TS, Elm, Elixir, Erlang, F#, etc., then you probably are not familiar with any of those mechanisms. Of the languages that do support typeclasses, deriving, and Generics (or some meta equivalent), how many require something comparable to this (the requisite familiarity with newtypes and deriving via, which is still an obscure feature that exists in only one other language, behind a pragma) to get basic debugging support? I'm genuinely curious, though I have a strong hunch the answer is 0.

The anecdotal data I do have is years of directing confused newcomers how to get Show support. After all, it's as simple as deriving Show in Haskell, for those that are familiar with it. It is not obvious to me at all that deriving via is any better than the status quo in that regard.

While the REPL would benefit from this, it's actually not my main intent here. I feel like Try PureScript is the REPL of choice because it also allows one to run code on a browser, the main target of PureScript.

Is this a significant distinction in practice? I think "REPL" can be a stand-in for any introductory environment.

If I don't explicitly derive Generic, should the compiler do that for me? Is this too much "magic"?

To be clear, I'm not suggesting the compiler opt the user into a Generic instance that is generally available. If the compiler can generate a Generic instance, then it can solve an ad-hoc Generic constraint (like it does for IsSymbol) for the purposes of deriving such that it doesn't leak, which could provide you with a uniform deriving mechanism that does not require users to know about Generics or deriving via.
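As a point of reference for "solving an ad-hoc constraint", this is how the existing IsSymbol case looks from the user's side. The sketch only illustrates the precedent; it says nothing about how a solved Generic constraint would be implemented. The names labelOf and nameLabel are made up for the example.

```purescript
module SolvedConstraintSketch where

import Data.Symbol (class IsSymbol, reflectSymbol)
import Type.Proxy (Proxy(..))

-- A function with an IsSymbol constraint ...
labelOf :: forall sym. IsSymbol sym => Proxy sym -> String
labelOf = reflectSymbol

-- ... can be used at any type-level string literal, even though nobody
-- ever wrote `instance IsSymbol "name"`. The compiler solves the
-- constraint on demand, and no instance is declared or exported.
nameLabel :: String
nameLabel = labelOf (Proxy :: Proxy "name")
```

The analogy is that a Generic constraint solved the same way would exist only while the derived instance is being built, so the type's structure would not become available to other modules.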

@rhendric
Member

Dumb question: why do we want users to specify Debug at all in any way on their types? If we wanted, surely we could introduce a compiler-solved AdHocDebug (bikeshed to be painted later) class, whose solution would involve first looking for a Debug instance and, if one isn't found, doing what this PR does (presumably) to make one locally, recursively solving AdHocDebug constraints on inputs. Then every type would be printable out of the box. What would we be giving up in exchange for that level of frictionlessness, which would surely satisfy all those JS/TS/Elm/etc. refugees better than anything else proposed thus far?

(Not sarcasm or irony; this is a genuine question and if we can't answer it then I support this as a plan of action, contra my previous stance. If there is some principle that makes this a bad idea then I want to see how that principle applies to the other proposals.)

@natefaubion
Contributor

Dumb question: why do we want users to specify Debug at all in any way on their types? If we wanted, surely we could introduce a compiler-solved AdHocDebug (bikeshed to be painted later) class, whose solution would involve first looking for a Debug instance and, if one isn't found, doing what this PR does (presumably) to make one locally, recursively solving AdHocDebug constraints on inputs. Then every type would be printable out of the box. What would we be giving up in exchange for that level of frictionlessness, which would surely satisfy all those JS/TS/Elm/etc. refugees better than anything else proposed thus far?

I think in general we've landed on the side of "users should opt-in to typeclass semantics", though we've certainly violated that for Coerce ergonomics. I don't think there's a technical reason it's not possible since we have an orphan restriction.

@garyb
Member

garyb commented Sep 17, 2022

I was going to suggest something similar actually. A class for debug printing I'd be 100% okay with it being magical where you never have to declare anything for it, and instances essentially come into existence on demand (to avoid bloating the JS with unused instances - although I guess that matters less with the tooling we have available now).

It doesn't even have to be a class mechanism, maybe it's just a function in Prim (I realise we don't have any of those so far) that will accept any value and produce a debug value for it, but just something that works without any need for the user to opt in.

@f-f
Member

f-f commented Sep 17, 2022

I have been annoyed at Rust for having to add a Debug instance for just printing stuff out, so I would love an automagic deriving of this. Though it might be weird to have to implement this manually in some cases (e.g. when there are functions involved, etc)

@JordanMartinez
Contributor Author

why do we want users to specify Debug at all in any way on their types?

I'll answer this from the perspective of using purescript-debugged's implementation. Generally, we don't. This PR would work for 80% of types using the constructor implementation. However, there are times when specific types would benefit from using an alternative representation than what would be derived here. Unfortunately, I can't know which types those are, and thus there would still be times when a developer would manually write a Debug instance for a type.

Below are the kinds of types where the debug implementation derived in this PR would not be the best:

@rhendric
Member

It doesn't even have to be a class mechanism, maybe it's just a function in Prim (I realise we don't have any of those so far) that will accept any value and produce a debug value for it, but just something that works without any need for the user to opt in.

I don't think that'd work as well—imagine we provide a Prim.autoDebug :: forall a. a -> Debug.Repr and users want to wrap that in some forall a. a -> Effect Unit that writes to a log file or something. The ‘determine the concrete type and build a debugger for it’ logic would need to be pushed out from the call site of autoDebug, and a class strikes me as the natural way to do that.
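A small sketch of the point about wrappers, using the existing Show class as a stand-in for the hypothetical compiler-solved class. The function name logValue is invented for the example.

```purescript
module ConstraintPropagationSketch where

import Prelude

import Effect (Effect)
import Effect.Console (log)

-- The wrapper is polymorphic in `a`, so it cannot choose a printer itself.
-- The constraint defers that choice to each call site ...
logValue :: forall a. Show a => a -> Effect Unit
logValue value = log ("[debug] " <> show value)

-- ... where the type is concrete and an instance can be resolved.
main :: Effect Unit
main = do
  logValue 42
  logValue [ true, false ]
```

With a plain `forall a. a -> Effect Unit` signature there would be nothing in the wrapper's type to carry the printing logic from the caller, which is why a class constraint is the natural vehicle.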

However, there are times when specific types would benefit from using an alternative representation than what would be derived here.

Yeah totally, that's why I proposed a different class in addition to Debug. Optionally implement Debug for custom control, but the compiler-solved AdHocDebug (maybe AutoDebug?) would be what functions and other instances consume, and would always have an instance for any concrete type. It might be a bit unusual to duplicate a class like this but it makes explicit where instances are coming from: instances for Debug come from code, instances for AdHocDebug come from the compiler (and sometimes delegate immediately to an available Debug). If the proposal were to use only one class, it would comprise a new, third category between classes that are always user-written—or at least explicitly derived—and classes that are always provided by the compiler, and I'm trying to introduce as little novelty as possible.

@MonoidMusician
Contributor

I'm slowly coming around to adding some compiler deriving or compiler magic for Debug. (I upvoted Ryan's first comment at the time.)

I think an automatically derived Debug would be very nice to use, if the user can override it by implementing it for their types manually, and I think we'd want to generate warnings when instances are automagically derived.

The other advantage of not requiring Generic is that Generic opens up your ADT to inspection by anyone, since we don't have a mechanism to restrict typeclasses to local scope.
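To make the inspection concern concrete, here is a minimal sketch. Token is a hypothetical type, and the export restriction is described in a comment rather than written out so that the example stays in a single module.

```purescript
module GenericLeakSketch where

import Data.Generic.Rep (class Generic, Argument(..), Constructor(..), from)

-- Hypothetical type whose constructor the author intends to keep private
-- (imagine the export list naming `Token` but not `Token(..)`).
data Token = Token String

derive instance Generic Token _

-- Instances cannot be hidden, so code in any module can go through `from`
-- and reach the wrapped String without ever naming the Token constructor.
leak :: Token -> String
leak token = case from token of
  Constructor (Argument secret) -> secret
```

A compiler-derived Debug that does not go through a public Generic instance would avoid exposing the structure in this way.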

But other than that, compiler deriving won't/can't do anything substantial that Generic wouldn't. (Ease of use, of course – but that's not substantial in the sense of having different behavior in running code.)

FWIW I don't like deriving syntax. But I'm still on team “explicit instance names” so maybe you shouldn't take my word 😝

Jordan said:

Having read through these comments, I'm actually surprised that no one made an objection to this PR on the basis that purescript-debugged isn't well-tested/used yet.

That was at the back of my mind. I honestly hadn't had a good look at it until it came up the other day.

One option to alleviate design concerns is that we could have typeclasses-on-typeclasses, and be generic over the debugging output type 😂 That is, instead of class Debug myType we would have

class (DebugStruct out) <= Debug out myType where
  debugInto :: myType -> out

where DebugStruct contains the generic interface methods we'd want on debugging representations.

Nate said:

I think in general we've landed on the side of "users should opt-in to typeclass semantics", though we've certainly violated that for Coerce ergonomics

We did provide ways for users to opt-in to Coerce semantics, through roles and private constructors, because it's really hard to make it ergonomic/convincing/predictable by just letting users declare instances. So if it is a violation it's because the design space is tightly constrained and something had to give …

@JordanMartinez
Contributor Author

Ryan said:

Yeah totally, that's why I proposed a different class in addition to Debug. Optionally implement Debug for custom control, but the compiler-solved AdHocDebug (maybe AutoDebug?) would be what functions and other instances consume, and would always have an instance for any concrete type. It might be a bit unusual to duplicate a class like this but it makes explicit where instances are coming from: instances for Debug come from code, instances for AdHocDebug come from the compiler (and sometimes delegate immediately to an available Debug). If the proposal were to use only one class, it would comprise a new, third category between classes that are always user-written—or at least explicitly derived—and classes that are always provided by the compiler, and I'm trying to introduce as little novelty as possible.

Yeah, I like this idea of AutoDebug and Debug. Code-wise, you're saying something like:

import Data.Debug.Type as DT
import Data.Debug as D

class AutoDebug a where
  debug :: a -> DT.Repr

-- Works as though "backtracking" was enabled
instance (D.Debug a) => AutoDebug a where
  debug = D.debug
else instance AutoDebug a where
  debug = <compiler implementation>

-- Code works on AutoDebug
diff :: forall a. AutoDebug a => a -> a -> Diff
diff = ...

-- Foo uses AutoDebug
data Foo = Foo

data Bar = Bar

-- Bar's AutoDebug uses its custom Debug instance
instance D.Debug Bar where
  debug = ...

Verity said:

I think we'd want to generate warnings when instances are automagically derived.

Why would you want warnings here?

One option to alleviate design concerns is that we could have typeclasses-on-typeclasses, and be generic over the debugging output type 😂

Due to 😂, are you being sarcastic? I can't tell 😄. On another note, that does raise the question as to what those out types would be. purescript-debugged uses Repr because it allows one to print the value in various ways (e.g. print, diff, etc.).

@MonoidMusician
Contributor

I didn't mean it sarcastically, I just expect that most people will think it's overkill :)

Re warnings: so that users understand it is automagic behavior that shouldn't be relied on, meaning that they should probably write or upstream the instances if they want them, because otherwise someone could come along and write an instance that silently changes its behavior! Of course you probably shouldn't be relying on its behavior at runtime anyways, but your tests are more likely to rely on it, you know?

Also I'm not sure that having a separate Debug and AutoDebug class is tenable implementation-wise, would have to see. I think it would be simpler to just have one class (that's particularly the scenario in which case I would like warnings to be generated).

@i-am-the-slime
Contributor

One thing I personally find irritating is the name Debug.
If this is all implicit and hidden away in the compiler, I guess the name matters less.

However, if I'm a newcomer and I come from a language where debugging actually means using a debugger it's not clear that deriving Debug is really more about print-line-debugging (a.k.a. dupa-debugging) than anything else. Since we do have features to debug now via sourcemaps there is potential for confusion here.

So I would like to suggest a name like Trace, Print, or DebugPrint.

Sorry for putting another door into the bike shed.

@JordanMartinez
Contributor Author

However, if I'm a newcomer and I come from a language where debugging actually means using a debugger it's not clear that deriving Debug is really more about print-line-debugging (a.k.a. dupa-debugging) than anything else. Since we do have features to debug now via sourcemaps there is potential for confusion here.

I disagree that this would be confusing for a few reasons:

  1. A debugger usually means setting breakpoints and whatnot before running a program. Perhaps one could incorrectly assume that adding this type class somehow helps with that. However, even if one did assume that, setting such breakpoints and then running the program would quickly reveal that this is not the case. So, if one did become confused by this, I don't think such confusion would last long. After it doesn't work as expected, one would likely ask a question on the Discord.
  2. I would imagine we would have docs on Debug that would clarify any misunderstanding a new user might have. Since hovering over the type class would display that documentation, again I don't think this potential confusion would last long.
  3. Once a new user understands how type classes work, I don't see how they could think this would be anything but print-line-debugging. Since derive instance Eq X is taught and used pretty quickly in one's learning of the language, I think derive instance Debug X would reinforce that Debug just provides an eq-like function called debug.

As for why I think this should still be named Debug:

  • similar functionality in Rust is called that, so there's some precedent there.
  • Print just feels like a less-descriptive version of Debug
  • Trace in Haskell and trace in PureScript are both used for temporary debugging, but these are always in the context of some monadic computation. If the type class was Trace, what would the function name be? If debug, why isn't the function named after the type class? If trace, it conflicts with our existing trace. If something like traceValue, I would find it weird to use trace $ traceValue MyValue when I want to debug some code using trace.
  • DebugPrint feels like a more verbose version of Debug. Moreover, this name also feels like it limits the scope of what one can do. debug produces a representation of the underlying value. While you can print that to a String, you could also "print" it to HTML and display it in the browser via Halogen/React. At that point, it's not so much print-line debugging as it is a general "show me the internal contents of this structure and let me do what I need with that information."
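The last point, one representation with several renderers, can be sketched as follows. The Repr type here is a hypothetical, much-simplified stand-in and not the Repr from purescript-debugged; toText and toHtml are likewise invented names.

```purescript
module ReprSketch where

import Prelude

import Data.Foldable (intercalate)

-- Hypothetical, simplified representation type
-- (NOT the Repr defined by purescript-debugged).
data Repr
  = Leaf String
  | Node String (Array Repr)

-- Renderer 1: a single-line string, e.g. for console output.
toText :: Repr -> String
toText (Leaf s) = s
toText (Node name children) =
  "(" <> intercalate " " ([ name ] <> map toText children) <> ")"

-- Renderer 2: nested HTML lists, e.g. for display in a browser.
-- No escaping is done; this is only a sketch.
toHtml :: Repr -> String
toHtml (Leaf s) = "<li>" <> s <> "</li>"
toHtml (Node name children) =
  "<li>" <> name <> "<ul>" <> intercalate "" (map toHtml children) <> "</ul></li>"

example :: Repr
example = Node "Just" [ Leaf "1" ]
```

Because the class member returns a structured value rather than a String, each consumer (printer, differ, HTML view) can be written once against the representation instead of once per type.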

@i-am-the-slime
Contributor

@JordanMartinez
Thanks, you make some good points! I will stop arguing for a different name.

@JordanMartinez
Contributor Author

Closing as this hasn't had any feedback since it's been opened. Moreover, the PRs I submitted to core libraries haven't received any feedback either. While I don't think this idea is dead, I do think the current approach isn't the one we'll take.

@JordanMartinez JordanMartinez deleted the jam/derive-debug branch March 4, 2023 03:36